November 1, 2017

Ignoring Outliers Creates Racist Algorithms

Have you built an algorithm that mostly works? Does it account for almost everyone's needs, save for a few weird outliers that you ignore because they make up 0.0001% of the population? Congratulations, your algorithm is racist! To illustrate how this happens, let's take a recent example from Facebook. My friend's message was removed for "violating community standards". Now, my friend has had all sorts of ridiculous problems with Facebook, so to test my theory, I posted the exact same message on my page, and then had him report it.





Golly gee, look at that, Facebook confirmed the message I sent does not violate community guidelines, but he's still banned for 12 hours for posting the exact same thing. What I suspect happened is this: Facebook has gotten mad at my friend for having a weird name multiple times, but he can't prove what his name is because he doesn't have access to his birth certificate because of family problems, and he thinks someone's been falsely reporting a bunch of his messages. The algorithm for determining whether or not something is "bad" probably took these misleading inputs, combined them with a short list of so-called "dangerous" topics like "terrorism", and then decided that if anyone reported one of his messages, it was probably bad. On the other hand, I have a very western name and nobody reports anything I post, so either the report actually made it to a human being, or the algorithm simply decided it was probably fine.

Of course, the algorithm was wrong about my friend's message. But Facebook doesn't care. I'm sure a bunch of self-important programmers are just itching to tell me we can't deal with all the edge-cases in a commercial algorithm because it's infeasible to account for all of them. What I want to know is, have any of these engineers ever thought about who the edge-cases are? Have they ever thought about the kind of people who can't produce birth certificates, or don't have a driver's license, or have strange names that don't map to Unicode properly because they aren't western enough?

Poor people. Minorities. Immigrants. Disabled people. All these people they claim to care about, all this talk of diversity and equal opportunity and inclusive policies, and they're building algorithms that by their very nature will exclude those less fortunate than them. Facebook's algorithm probably doesn't even know that my friend is Asian, yet it's still discriminating against him. Do you know who can follow all those rules and assumptions they make about normal people? Rich people. White people. Privileged people. These algorithms benefit those who don't need help, and disproportionately punish those who don't need any more problems.

What's truly terrifying is that Silicon Valley wants to run the world, and it wants to automate everything using a bunch of inherently flawed algorithms. Algorithms that might be impossible to perfect, given the almost unlimited number of edge-cases that reality can come up with. In fact, as I am writing this article, Chrome doesn't recognize "outlier" as a word, even though Google itself does.

Of course, despite this, Facebook already built an algorithm that tries to detect "toxicity" and silences "unacceptable" opinions. Even if they could build a perfect algorithm for detecting "bad speech", do these companies really think forcibly restricting free speech will accomplish anything other than improving their own self-image? A deeply cynical part of me thinks the only thing these companies actually care about is looking good. A slightly more optimistic part of me thinks a bunch of well-meaning engineers are simply being stupid.

You can't change someone's mind by punching them in the face. Punching people in the face may shut them up, but it does not change their opinion. It doesn't fix anything. Talking to them does. I'm tired of this industry hiding problems behind shiny exteriors instead of fixing them. That's what used car salesmen do, not engineers. Programming has devolved into an art of deceit, where coders hide behind pretty animations and huge frameworks that sweep all their problems under the rug, while simultaneously screwing over the people who were supposed to benefit from an "egalitarian" industry that seems less and less egalitarian by the day.

Either Silicon Valley needs to start dealing with people that don't fit in neat little boxes, or it will no longer be able to push humanity forward. If we're going to move forward as a species, we have to do it together. Launching a bunch of rich people into space doesn't accomplish anything. Curing cancer for rich people doesn't accomplish anything. Inventing immortality for rich people doesn't accomplish anything. If we're going to push humanity forward, we have to push everyone forward, and that means dealing with all 7 billion outliers.

I hope Silicon Valley doesn't drag us back to the feudal age, but I'm beginning to think it already has.

October 10, 2017

My Little Pony And The Downfall Of Western Civilization

Our current political climate is best described as a bunch of chimpanzees angrily throwing feces at each other. Many people recognize that there are many problems with modern politics, but few agree on what those problems actually are. Liberals, Republicans, Trump, millennials, capitalism, communism, too many regulations, too few regulations, gerrymandering, illegal immigrants, rich people, poor people, schools, governments, religion, secularism, the list goes on and on and on. These, however, are not problems, they are excuses. They are scapegoats we have conjured up from our mental list of things we don't like so we can ignore the ugly truth.

At some point, Americans are going to have to admit that the entire problem with this country has been summed up in a TV show for six-year-old girls designed to sell toys. My Little Pony was released seven years ago, and as the series has progressed, it has focused more and more on reforming villains instead of defeating them. Time and time again, ponies refuse to give up on those who seem lost to darkness and try to figure out what is causing them to lash out. A central theme of the show is that almost no one is truly a "villain", only misguided, misunderstood, or pushed towards lashing out after a traumatic experience. While a select few villains have been shown to be truly irredeemable, this is rare, which is in line with reality. Only a very small percentage of the human population is deliberately evil as opposed to simply lashing out, being stupid, or being tragically misinformed.

It is no longer possible to argue with anyone anymore. This is not because everyone suddenly became incapable of critical thought, but because no one can agree on any facts. When we built the internet, we thought having access to the sum of all human knowledge would bring infinite prosperity, but instead it brought us infinite misinformation. Russians have been making this worse, feeding false narratives to both sides to make us hate each other. It worked. When scientists themselves have been manipulated by corporations without anyone bothering to attempt to reproduce experiments, there simply isn't any good way to verify the truth of anything. The entire point of the scientific method is to make sure something is reproducible, but a staggering number of retractions in recent years has demonstrated that barely anyone actually checks anyone else's work. This has fostered a general distrust in science, even for results that have been reproduced thousands of times, like the studies debunking the "Vaccines cause autism!" claim.

This kind of political polarization is untenable. We no longer live in a world where we can get into a fight, stab someone else with a sword and call it good. If our political polarization goes unchecked, it will result in nothing less than the total collapse of western civilization. Humanity will have proven itself too dumb and tribalist to wield a tool as powerful as the internet. Whatever civilization comes after us will struggle to reclaim the technological progress we enjoy, now that we've stripped the planet of resources. If we fail now, humanity will never again be capable of reaching for the stars. We will have sentenced ourselves to live on this small rock until the sun boils the oceans away, doomed by our own stupidity.

There was an episode of My Little Pony that explained the origin of their country, Equestria. The three races, Pegasi, Earth ponies, and Unicorns, hated each other. Evil forces fed on their hatred, smothering the land in snow and threatening to freeze them all to death. So, each race set out to find a new land to colonize, only to suddenly realize that each of them had wound up on the exact same new continent. The leaders of each race immediately started yelling at each other, and the blizzard returned, until each of them was encased in ice. Only when the assistants of each leader realized they didn't actually hate each other were they able to dispel the evil forces by starving them of hatred and talking sense into their leaders. The moral of this story is very clear: either we figure out how to get along, or we're all going to die. I really, honestly don't know how to explain this better than a Saturday morning cartoon show about magical ponies.

The problem is that I don't know if humanity is capable of moving past this. I've seen one of my friends rapidly devolve into insane conspiracy theory nonsense, and I simply don't have the mental willpower to engage with them. I eventually had to block them, and accomplished two things at once: reinforcing their echo chamber and reinforcing my echo chamber. I tried to hold on to them as a window into conservative nonsense so my Twitter wasn't a complete echo chamber, but when the other side is saying truly horrible things about you, this becomes more and more difficult. It also makes it more likely for me to say truly horrible things about the other side, in a vicious, endless cycle, and I simply have too many other things to worry about to be capable of dealing with that level of toxicity. At this stage, I'm not sure humanity has the strength to actually reconcile with itself. I think there is a real possibility that our worries about AI were misplaced - the technology that might ultimately destroy us could be the internet, simply because our tribalist brains are too desperate to find someone to hate. We may be fundamentally incapable of sustaining a global, interconnected society with instantaneous communication.

At least then we'll have a very definitive answer to the Fermi Paradox.

September 5, 2017

I Used To Want To Work For Google

A long time ago I thought Google was this magical company that truly cared about engineering and solving problems instead of maximizing shareholder value. Then Larry Page became CEO and I realized they were not a magical unicorn and lamented the fact that they had been transformed into "just another large company". Several important things happened between that post and now: Microsoft got a new CEO, so I decided to give them a shot and got hired there. I quit right before Windows 10 came out because I knew it was going to be a disaster. More recently, it's become apparent that Google has gone far past simply being a behemoth unconcerned with the cries of the helpless and transformed into something outright malevolent. It's silenced multiple reporters, blocked Windows Phone from accessing YouTube out of spite, and successfully gotten an entire group of researchers fired by threatening to pull funding (but that didn't stop them).

This is evil. This is horrifying. This is the kind of stuff Microsoft did in the 90s that made everyone hate it so much they still have to fight against the repercussions of decisions made two decades ago because of the sheer amount of damage they did and lives they ruined. I'm at the point where I'd rather go back to Microsoft, whose primary sin at this point is mostly just being incompetent instead of outright evil, rather than Google, who is actually doing things that are fundamentally morally wrong. These are the kinds of decisions that are bald-faced abuses of power, without any possible "good intention" driving them. It's vile. There is no excuse.

As an ex-Microsoft employee, I can assure you that at no point did I think Microsoft was doing something evil while I was there. I haven't seen Microsoft do anything outright evil since I left, either. The few times they came close they backed off and apologized later. Microsoft didn't piss people off by being evil, it pissed people off by being dumb. I was approached by a Google recruiter shortly after I left and I briefly considered going to Google because I considered them vastly more competent, and I still do. However, no amount of engineering competency can make me want to work for a company that actively does things I consider morally reprehensible. This is the same reason I will never work for Facebook. I've drawn a line in the sand, and I find myself in the surprising situation of being on the opposite side of Google, and discovering that Microsoft, of all companies, isn't with them.

I always thought I'd be able to mostly disregard the questionable things that Google and Microsoft were doing and compare them purely on the competency of their engineers. However, it seems that Google has every intention of driving me away by doing things so utterly disgusting I could never work there and still be able to sleep at night. This worries me deeply, because as these companies get larger and larger, they eat up all the other sources of employment. Working at a startup that isn't one of the big 5 won't help if it gets bought out next month. One friend of mine with whom I shared many horror stories worked at LinkedIn. He was not happy when he woke up one day to discover he now worked for the very company he had heard me complaining about. Even now, he's thinking of quitting, and not because Microsoft is evil - they're just so goddamn dumb.

The problem is that there aren't many other options, short of starting your own company. Google is evil, Facebook is evil, Apple is evil if you care about open hardware, Microsoft is too stupid to be evil but might at some point become evil again, and Amazon is probably evil and may or may not treat its employees like shit depending on who you ask. Even if you don't work directly for them, you're probably using their products or services. At some point, you have to put food on the table. This is why I generally refuse to blame someone for working for an evil company because the economy punishes you for trying to stand up for your morals. It's not the workers' fault, here, it's Wall Street incentivizing rotten behavior by rewarding short-term profits instead of long-term growth. A free market optimizes to a monopoly. Monopolies are bad. I don't know what people don't get about this. We're fighting over stupid shit like transgender troops or gay rights instead of just treating other human beings with decency, all the while letting rich people rob us blind as they decimate the economy. This is stupid. I would daresay it's almost more stupid than the guy at Microsoft who decided to fire all the testers.

But I guess I'll take unrelenting stupidity over retaliating against researchers for criticizing you. At least until Microsoft remembers how to be evil. Then I don't know what I'll do.

I don't know what anyone will do.

August 6, 2017

Sexist Programmers Are Awful Engineers

Men and women are fundamentally different. So are white people and black people and autistic people and gay people and transgender people and conservatives and liberals and every other human being along every imaginable axis of discrimination. Some of these differences are cultural. Others are genetic. Others depend on environmental factors. These differences mean that some of us are inherently better at certain tasks than others. On average, men are better at spatial temporal reasoning, women are better at reading comprehension and writing ability, and psychopaths can sometimes be excellent CEOs.

Whenever I meet a programmer who insists on doing everything a certain way, the chances I'll hire them drop off a cliff. Just as object-oriented programming didn't fix everything, neither will functional programming, or data-oriented programming or array-based programming or any other paradigm. They are different tools that allow you to attack a problem from different directions, much like we have different classes of algorithms to attack certain classes of problems. Greedy algorithms, lazy evaluation, dynamic programming, recursive-descent, maximum flow, all of these are different ways to approach a problem. They represent looking at a problem from different perspectives. A problem that is difficult from one angle might be trivial when examined from a different angle.

When I stumbled upon this anti-diversity memo written by a Google employee, I wondered just how dysfunctional of an engineer that person is. Problems are never solved by being closed-minded. They are solved by opening ourselves to new possibilities and exploring the problem space as an infinitely-dimensional fabric of possible configurations. You do not find possible edge-cases by being closed-minded. You find them by probing the outer edges of your solution, trying to find singularities and inflection points that hint at unusual behavior.

You cannot build a great company by hiring people who are good at the same things you are. Attempting to maximize diversity only comes at a perceived cost of aptitude if you are measuring the wrong things. If your concept of what makes a "good programmer" is an extremely narrow set of skills, then you will inevitably select towards a specific ethnicity, culture, or sex, because the tiny statistical differences will be grossly magnified by the extremely narrow job requirements. Demand that all your programmers invert a binary tree on a whiteboard and you'll filter out the guy who wrote the software 90% of your company uses.

If you think the field of computer science is really this narrow, you're a terrible programmer. Turing completeness is a fundamental property of the universe, and we are only just beginning to explore the full implications of information theory, the foundations of type theory, NP-completeness, and the nature of computation itself. Disregarding other people because they can't do something without ever considering what they can do will only hurt your team, and your company. Diversity inclusion programs shouldn't try to hire more women and ethnic groups because they're the same, they should be trying to hire them because they are different.

When hiring someone to complete a job, you should hire whoever is the best fit for the job. In a vacuum where there is a single task that needs to be completed, gender and ethnicity should be ignored in favor of a purely meritocratic assessment. However, if you have a company that must respond to a changing world, diversity can reveal solutions you never even knew existed. An established company like Google must actively seek to increase diversity so that it can explore new perspectives that may give it an edge over its rivals. They cannot select on a purely meritocratic basis, because all measures of merit would be based on what the company is already good at, not what it could be good at. You cannot explore new opportunities by hiring the same people.

Intelligent people value feedback from people who think differently than them. This is why many executives will deliberately hire people they disagree with so they can have someone challenge their views. This helps avoid creating an echo-chamber, which is the ultimate irony of a memo that's called "Google’s Ideological Echo Chamber", because scrapping the diversity inclusion programs as the memo suggests would itself create a new echo-chamber. You can't remove an echo-chamber by removing diversity - the author's premise is self-defeating. If they had stuck with only claiming that conservative ideologies should not be discriminated against, they would have been correct. Unfortunately, telling everyone they shouldn't discriminate against your perspective, which itself involves discriminating against other perspectives, is by definition a contradiction.

We aren't going to write better programs by doing the same thing we've been doing for the past 10 years. To improve is to change, and those who seek to become better software engineers must themselves embrace change, or they will be left behind to rot in the sewers of forgotten programs, maintaining rancid enterprise code for the rest of their lives. If we are unwilling to change who is writing the programs, we'll be stuck making the same thing over and over again. A business that thinks nothing needs to change is one ripe for disruption. If you really think only hiring white males who correctly answer all your questions about graph theory and B-trees will help your business in the long-term, you're an idiot.

July 30, 2017

Why I Never Built My SoundCloud Killer

While the news of SoundCloud imploding is fairly recent, musicians and producers have had a beef with the music uploading site's direction for years. I still remember when SoundCloud only gave you a paltry 2 hours worth of free upload time and managed to convert my high quality lossless WAV files to the shittiest 128 kbps MP3 I've ever heard in my life. What really pissed me off was that they demanded a ridiculous $7 a month just to double your upload time. This is in contrast to Newgrounds, a tiny website run by a dozen people with an audio portal built almost as an afterthought that still manages to be superior to every single other offering. It gives you unlimited space, for free, and lets you upload your own MP3, which allows me to upload properly encoded joint-stereo 128 kbps MP3 files, or much higher quality MP3s for songs I'm giving out for free.

Obviously, Newgrounds is only able to offer unlimited free uploads because the audio portal just piggybacks on the rest of the site. However, I was so pissed off at SoundCloud's disgusting subscription offering that I actually ran the numbers in terms of what it would cost to store lossless FLAC encodings of songs using Amazon S3. These calculations are now out of date, so I've redone them for the purposes of this blog.

The average size of a FLAC-encoded song is around 60 MB, but we'll assume it's 80 MB as an upper bound, and to include the cost of storing the joint-stereo 128 kbps streaming MP3, which is usually less than 10% the size (using Opus would reduce this even more, but it is not supported on all browsers yet). Amazon offers the first 50 TB of storage at $0.023 per gigabyte, per month. This comes down to about $0.00184 per month, per song, in order to store the full lossless version. Now, obviously, we must also stream the song, but we're only streaming the low-quality version, which is 10% the size, which is about 7 MB in our example (7 MB + 70 MB is about 80 MB for storage). The vast majority of music producers on the website have almost no following, and most will be lucky to get a single viral hit. As an example, after being on SoundCloud for over 7 years, I have managed to amass a mere 100,000 views total. If I somehow got 20,000 views of my songs every single month, the total cost of streaming 140 GB from Amazon S3 at $0.05 per GB would be $7 per month. That's how much SoundCloud is charging just to double my storage space!
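If you want to check the arithmetic yourself, here is a rough Python sketch of the per-song numbers. The prices and sizes are just the figures quoted above, the function and constant names are made up for illustration, and I'm assuming decimal gigabytes (1000 MB per GB), so treat it as a back-of-the-envelope estimate rather than a real S3 bill.

```python
# Back-of-the-envelope estimate using the figures quoted in the post.
# All names here are illustrative; prices are the post's 2017 numbers.
STORAGE_USD_PER_GB_MONTH = 0.023  # first 50 TB storage tier
TRANSFER_USD_PER_GB = 0.05        # streaming (transfer out) price per GB
SONG_STORAGE_MB = 80              # FLAC plus streaming MP3, upper bound
STREAM_MB = 7                     # 128 kbps MP3 sent per play
MB_PER_GB = 1000                  # assumption: decimal gigabytes


def storage_cost_per_song() -> float:
    """Monthly cost of keeping one song (FLAC + MP3) in storage."""
    return SONG_STORAGE_MB / MB_PER_GB * STORAGE_USD_PER_GB_MONTH


def streaming_cost(plays_per_month: int) -> float:
    """Monthly cost of streaming the low-quality MP3 for each play."""
    return plays_per_month * STREAM_MB / MB_PER_GB * TRANSFER_USD_PER_GB


if __name__ == "__main__":
    print(f"storage per song: ${storage_cost_per_song():.5f} per month")
    print(f"20,000 plays:     ${streaming_cost(20_000):.2f} per month")
```

Even with these rough inputs, storage is a rounding error next to bandwidth: one month of 20,000 plays costs thousands of times more than keeping the song on disk.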

This makes even less sense when you calculate that 6 hours of FLAC would be 4.7 GB, or about 5 GB including the 128 kbps streaming MP3s. 5 GB of storage space costs a pathetic $0.12 a month to store on Amazon S3! All of the costs come down to bandwidth, which is relatively fixed by how many people are listening to songs, not how many songs there are. This means, if I'm paying any music service for the right to store music on their servers, I should get near unlimited storage space (maybe put in a sanity check of 10 GB max per month to limit abuse). I will point out that Clyp.it actually does this properly, giving you 6 hours of storage space for free and unlimited storage space if you pay them $6 a month.
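Here is the same kind of rough sketch for the storage side, again in Python with made-up names. The $0.023 per GB price is the figure quoted earlier in this post; the "a full year of uploading at the 10 GB cap" scenario is my own assumption, added only to show how slowly storage costs grow.

```python
# Rough storage-only estimate; the price is the post's quoted S3 figure.
STORAGE_USD_PER_GB_MONTH = 0.023


def monthly_storage_cost(gigabytes: float) -> float:
    """Monthly cost of keeping the given amount of audio in storage."""
    return gigabytes * STORAGE_USD_PER_GB_MONTH


def stored_after(months: int, upload_cap_gb: float = 10) -> float:
    """Total GB stored if someone hits the upload cap every month.

    Hypothetical worst case: the 10 GB cap is the sanity check suggested
    above, and nothing is ever deleted.
    """
    return months * upload_cap_gb


if __name__ == "__main__":
    # 6 hours of FLAC plus streaming MP3s, about 5 GB.
    print(f"6 hours of music: ${monthly_storage_cost(5):.3f} per month")
    # A full year of uploading at the cap, 120 GB in total.
    year = stored_after(12)
    print(f"{year:.0f} GB after a year: ${monthly_storage_cost(year):.2f} per month")
```

So even someone who maxes out a 10 GB cap every month for a year would cost less than $3 a month to store, which is still well under what these services charge for a subscription.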

Unfortunately, Clyp.it does not try to be SoundCloud as it has no comments, no reshares, and well, not much of anything, really. It's like a giant pastebin for sounds where you can follow people or favorite things. It's also probably screwed.

Even though I had a name and a website design, I never launched it because even if I could find a way to identify a copyrighted song via some sort of ContentID system, I couldn't make it work without the record industry's cooperation. The problem is that the system has to know what songs are illegal in order to block them in the first place. Otherwise, people could upload Justin Bieber songs with impunity and I'd still get sued out of existence. The hard part about making a site like SoundCloud isn't actually making the website, it's dealing with the insane, litigation-happy oligarchs that own the entire music industry.

SoundCloud's ordeal is mentioned in this article. Surprisingly, it took until 2012 for them to realize they had to start making deals with the major music labels. It took until 2014 for many of those deals to actually happen, and they were not in SoundCloud's favor. A deal with Warner Music Group, closed in 2014, gave Warner a 3-5% stake in the company and an undisclosed cut of ad-revenue, just so SoundCloud could have the privilege of not being sued out of existence. This wasn't even an investment round, it was just so SoundCloud could have Warner Music Group's catalog on the site and not get sued!

At this point, you have to be either very naive or very rich to go up against an industry that can and will send an army of lawyers after you. The legal system is not in your favor. It will be used to crush you like a bug and there is nothing you can do about it, because of one fundamental problem: You can't detect copyright infringement without access to the original copy.

Because of this, the music industry holds the entire world hostage with a kind of Catch-22: They demand you take down all copyright infringing material, but in order to figure out if something is copyright infringement, you need access to their songs, which they only give out on their terms, which are never in your favor.