There’s no reason why we should all get the same recommendations when looking for a place to eat, drink or shop. Getting a one-size-fits-all list of places may have been innovative in 2006, but it feels downright antiquated now. Our tastes are all different, so why should we all see the same…
The Django project, a well-known Python webapp framework, recently changed all its uses of the database terminology “master/slave” to the alternative terms “primary/replica”.
The “master/slave” terminology has been in use for many years in the database world, but those terms can have racially-charged and offensive connotations. So I wholeheartedly applaud Django for getting rid of them.
However you may be wondering what on earth those terms were used to describe in the first place. To explain, we need a little background on database systems.
Persistence and Consistency
"Database" is one of those generic computer science/engineering terms that applies to a bunch of different things. However in this context we’re talking about software that stores and retrieves organized collections of data.
Two of the most important properties of a database system are persistence and consistency. Persistence means that the database doesn’t lose data. Consistency means that data always obeys any rules required of it. For example, a voting system will likely require that every user can only vote once. It turns out to be very tricky to guarantee both persistence and consistency in a reliable way, a topic I may return to in a future post.
Persistence typically requires that the data be stored on more than one machine; otherwise the failure of a single machine may cause data to be permanently lost. Keeping copies of the data on multiple machines in this way is known as replication.
Replication makes achieving consistency harder. For example, if two different machines have different versions of the data, which is the true version? One common way of dealing with this is “primary/secondary” replication.
In Primary/Secondary replication, one of the machines in the system is chosen as a primary or master, and all the other machines are designated as secondaries or replicas. All write operations (creating, updating and deleting data) go to the primary. After processing the write operation, the primary sends a message to each of the secondaries telling them to apply the same write operation.
In this scheme the secondaries are always a little bit “behind” the primary. The time between processing a write on the primary and processing it on a secondary is called the replication lag of that secondary. Because the secondaries may not be completely up-to-date, the primary is considered to be the “source of truth”. If you need absolute consistency you must read your data from the primary. However if you can tolerate a little staleness, you can read from the secondaries, and take some of the load off the primary.
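To make the scheme concrete, here's a minimal sketch in Python (the class and method names are invented for illustration, not any particular database's API):

import random

class Cluster:
    # Toy model of primary/replica routing; real systems replicate asynchronously,
    # which is exactly what creates replication lag.
    def __init__(self, primary, replicas):
        self.primary = primary    # the single source of truth
        self.replicas = replicas  # copies that may lag behind

    def write(self, key, value):
        # All writes go to the primary, which tells the replicas to apply them.
        self.primary[key] = value
        for replica in self.replicas:
            replica[key] = value

    def read(self, key, require_fresh=False):
        if require_fresh:
            # Absolute consistency: read from the primary.
            return self.primary.get(key)
        # Tolerate a little staleness and take some load off the primary.
        return random.choice(self.replicas).get(key)

cluster = Cluster(primary={}, replicas=[{}, {}])
cluster.write("votes:alice", 1)
print(cluster.read("votes:alice"))                      # possibly stale
print(cluster.read("votes:alice", require_fresh=True))  # always current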
If the primary breaks down, one of the secondaries can be promoted to become the new primary. However before it can do so it must process all the outstanding replication messages it received from the old primary, to ensure that it’s up to date. The process of choosing a secondary to promote is called master election.
What’s in a Name?
As you can see, the old “master/slave” terminology is not only offensive but also a misleading analogy (can a “slave” be promoted to “master”?). The term “replica” is both inclusive and informative, and using it is a win-win. Kudos to Django for this change; I hope many more projects follow suit.
Since we announced Swarm last week, hundreds of thousands of people have signed up to be notified when it launches. (Thank you!) We’ve also had a bunch of people ask what this means for the playful parts of Foursquare, like points, badges, and mayorships. We wanted to take a second to explain…
We spend a lot of time talking to people about Foursquare, and we constantly hear they use Foursquare for two things – to keep up and meet up with their friends, and to discover great places. Every month, tens of millions of people open up the app to do each.
“They have no character. They have no guts. They lack courage. I’m old school. If I’ve got something to say, I’ll say it to your damn face. A lot of times people don’t like that, and they’ll punch you. But that’s their opportunity, and that’s the way you do business in this life: You say it to their face.”
A recent episode of This American Life told the story of Gene Cooley, a resident of the small town of Blairsville, GA whose reputation was destroyed by vicious anonymous posts on Topix. The quote above is from Cooley’s lawyer, who successfully sued to unmask Cooley’s online tormentors. The segment is fascinating and well worth a listen, especially given the rise of Secret.
Secret, the recently launched “anonymous twitter” app, has taken the tech industry by storm with its combination of personal confessions, trolling, Silicon Valley inside baseball, and bashing and trashing of companies and people.
Anonymous discourse has its place: It provides a safe space in which vulnerable people can express themselves without fear. It can give a voice to people marginalized by incumbent power structures in other media. But anonymity also creates a place where lies, slander and bullying go unchecked.
In a notable recent example, someone published an anonymous Medium post attempting to discredit Julie Ann Horvath and her claims of harassment at GitHub. Contrast this with Horvath’s courage in discussing her experience. She described specific incidents, named names, and put her own credibility on the line to do so. She took a risk in order to speak out, and I don’t doubt that she’s been threatened and intimidated for it. “Jane Doe” on the other hand, wants to trash Horvath without putting anything on the line, and it’s hard to believe anything he/she says when there are no consequences to lying.
Anonymity can be a corrective against privilege. When the deck is stacked against you, it may give you a safe way to be heard. But, on Secret and elsewhere, it’s too often used in exactly the opposite way. And this is especially grating when so many courageous people, especially women, do in fact speak up publicly, often at great risk.
If you’re going to attack someone publicly, even if you’re convinced you’re telling the truth, use your real name, and stand behind your allegations. Because that’s the way you do business in this life: You say it to their face.
In my last CS101 post we began to discuss different programming language paradigms, and how they affect the performance, correctness and universality of programs. In this installment we’ll provide more detail on one of the most important distinctions between programming languages, namely how they provide modularity.
Programs can be huge: Even a modest webapp may consist of hundreds of thousands of lines of code. One of the main concerns a programmer has is how to structure a program so that it’s readable and maintainable by herself and others. Having one gigantic lump of code is as untenable as trying to build a house from a single giant pile of bricks, lumber and hardware.
Instead, much of the art of programming consists of breaking the program down into components, or modules, and then further breaking down those modules into submodules, and so on down. Modules provide organization and structure through separation of concerns, i.e., a single module is responsible for a single area of functionality.
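As a hypothetical sketch of what that decomposition looks like in Python (the module split and names are invented for illustration):

# Imagine these as three separate files in a small webapp.

# payments.py: knows how to charge cards, and nothing else.
def charge(card_token, amount_cents):
    return "charge-123"  # stand-in for a real call to a payment provider

# receipts.py: knows how to email receipts, and nothing else.
def send_receipt(email_address, amount_cents):
    print(f"Sent receipt for {amount_cents} cents to {email_address}")

# orders.py: composes the other modules without knowing their internals.
def place_order(card_token, email_address, amount_cents):
    confirmation = charge(card_token, amount_cents)
    send_receipt(email_address, amount_cents)
    return confirmation

place_order("tok_abc", "customer@example.com", 1999)

Each piece can now be read, tested and changed on its own, which is the whole point of separation of concerns.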
Git, GitHub and the Ethics of Engineering Collaboration
GitHub is in the news right now, and not in a good way: Engineer Julie Ann Horvath has resigned from the company, alleging serious incidents of harassment and inappropriate behavior.
People seem to be focusing on some of the more prurient allegations - office hula-hooping and so on - but are largely ignoring what to me might be the worst offense: another engineer “passive-aggressively ripping out [her] code from projects [they] had worked on together without so much as a ping or a comment.”
Unfortunately, outside the engineering community it’s not always obvious what GitHub is, and why this kind of behavior, bad anywhere, should be especially unacceptable there. But to understand GitHub, you first need to understand how software engineers work together.
In my last CS101 post I described how programming languages are an intermediary between human language and machine code: the logic operations implemented by a computer’s circuits. In this, and my next few posts, we’ll look at programming languages in more detail, and discuss different language designs and capabilities.
The earliest and simplest programming languages were assembly languages. An assembly language is just a slightly more human-readable version of a system’s machine code. It uses mnemonics to represent individual logic instructions, and allows the programmer to use labels to refer to memory locations. A program called an assembler turns assembly language code into proper machine code.
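As a rough sketch of that idea (a toy in Python, with invented mnemonics and opcodes, nothing like a real instruction set):

# A toy 'assembler': translate human-readable mnemonics into numeric opcodes.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03}

def assemble(lines):
    # Turn lines like "ADD 7" into (opcode, operand) pairs of plain numbers.
    machine_code = []
    for line in lines:
        mnemonic, operand = line.split()
        machine_code.append((OPCODES[mnemonic], int(operand)))
    return machine_code

program = ["LOAD 10", "ADD 7", "STORE 12"]
print(assemble(program))  # [(1, 10), (2, 7), (3, 12)]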
Y Combinator co-founder Paul Graham got into some hot water recently for controversial comments about women in tech. This follows a previous dodgy statement of his about “founders with foreign accents”, and another saying “I can be tricked by anyone who looks like Mark Zuckerberg”.
I’ve never met Graham, but by all accounts he’s a well-meaning guy who doesn’t maliciously discriminate. And he acknowledges at least some of his prejudice, which is more than most people who worship at the cult of meritocracy are willing to do. Unfortunately, however, the underlying bias emerging from Graham’s statements is common in Silicon Valley.
This bias is typically not due to an intent to exclude women, immigrants, people of color or older people. Instead it’s the result of an unhealthy and extreme reverence for one, and only one, archetype: The Hacker.
Boyz n the Hoodie
The Hacker archetype is a 20-something, hoodie-wearing, white male. He grew up with enough economic privilege to be able to teach himself programming as a teen. He’s so consumed by computing that he has no time for such trivia as social skills or a dress sense. His drive and ambition to “innovate”, “disrupt” and “change the world” leave him with little patience for rules or standards of conduct. His mantra is “Move Fast and Break Things” (especially other people’s things). He’s the Silicon Valley realization of two tropes rolled into one: the pioneer and the self-made man (who is almost always a man, and almost never self-made).
The platonic form of The Hacker is, of course, Mark Zuckerberg.
Now, Paul Graham claims to have been misquoted. I take him at his word that his comments about women were only intended in a narrow context. But his correction, with its further harping on and on about “hacker this” and “hacker that”, is actually more revealing than his original statement. All his statements above, including this correction, make it abundantly clear that, to him, the only kind of person who counts is The Hacker. And Graham is far from alone in this thinking.
When Did a Hack Become a Good Thing?
In journalism, a “hack” is a pejorative term for a writer who churns out large amounts of low-quality content. In programming, a “hack” denotes an inelegant solution, typically to band-aid over a problem until a more comprehensive solution can be written.
Yet somehow, in the past decade, “hacker” became a compliment. Facebook, for example, built its entire corporate culture around “The Hacker Way”, a mish-mash of truisms about fast iteration and prototyping, such as “done is better than perfect”.
Graham takes this even further, staking out a distinction between CS majors and hackers, to the detriment of the former. For example:
The problem with that is I think, at least with technology companies, the people who are really good technology founders have a genuine deep interest in technology. In fact, I’ve heard startups say that they did not like to hire people who had only started programming when they became CS majors in college.
Somehow, being a self-taught, college-dropout programmer with no industry experience has become not a liability but a badge of honor. This is a great shame, because true technological innovation often requires knowledge, experience and maturity.
Many conversations about “tech” are actually about products, or worse, about money. Modest UX tweaks are frequently lauded as “innovation”. But there’s also a lot of truly heavy lifting to be done in the tech industry, and this requires expertise, talent and rigor, qualities that we must look beyond the “hacker pool” to find.
It’s hard for an early-stage investor to predict eventual returns based on little more than a pitch deck. There are few objective measures by which to judge an early-stage startup. So VCs fall back on “intuition”, sometimes more honestly referred to as “pattern matching”. And what better pattern to match than ur-hacker Zuck, the founder of a company that went from $0 to $100B in eight years?
The trouble is, what’s really going on is mostly just confirmation bias and selection bias, and on a hopelessly small sample size at that. “Pattern matching” is really just an anodyne synonym of “prejudice”.
It may not look like prejudice, because the focus is less on what you are (a woman, a person of color, over 40) than on what you’re not (a Hacker). So it may not be grounded in overt sexism or racism, but it’s all the more insidious for that. At least with Pax Dickinson you know what you’re getting into. It’s harder to deal with discrimination that isn’t aware of its own existence.
Hacking The Hacker
This prejudice’s obsession with a single archetype is also its weak spot: Deconstruct the Hacker and you weaken the bias it engenders. By challenging various aspects of this archetype we can reduce its near-mystical significance in Silicon Valley. Take away the pattern, and pattern matching becomes much harder.
So think of this as a call to arms: Let’s hack The Hacker!
It’s not that conforming to the Hacker archetype is bad in itself. It’s that mythologizing just one type of person necessarily excludes others from access to capital, jobs and other resources. Not to mention the fact that it also creates a lot of false positives: bad ideas that get funding because the founder “looked right”. And such poor allocation of capital is bad for everyone: investors, hackers and the rest of us.
So the goal is not to take down any individual, but to rid the Hacker ethos of its glamor. To say that it’s fine to be a Hacker, and equally fine not to be one. Whatever your background, and however you got to where you are, investors like Graham should have open minds about you and your ideas.
As we’ve discussed in previous installments, computer programs are sequences of instructions that tell computers what to do. Any software that runs on a computer - be it Mac OS X, Google Chrome, the Apache web server or Candy Crush Saga - is a program, and someone has to write that program.
The problem is that people speak English(*) while computers understand the 0s and 1s that trigger their circuits. This is where programming languages come in.
Meeting The Computer Half Way
A programming language is an intermediary between English and the low-level instructions computers understand. It’s a compromise between the looseness of natural languages and the structured formality required for machine interpretation.
For example, a human might say:
I want to print all the whole numbers from 0 to 9.
And another human might express the same idea with a different sentence:
I want to print all the non-negative integers less than ten.
This loose informality of natural language makes it unsuitable for communicating with computers. A computer natively understands only the very low-level machine code instructions baked into its circuits. But humans can’t easily compose machine code directly.
Instead, the human writes instructions as a program, in a programming language.
For our example, we’ll use the programming language Python:
for number in range(0, 10): print(number)
This program is structured enough for a computer to interpret, but also “English-y” enough for a human to write. In this case, even with no programming experience at all, you can probably figure out what it means.
If your computer has Python installed you can see for yourself that computers have no problem understanding this program. On a Mac, go to Finder -> Applications -> Utilities -> Terminal, type
python -c "for number in range(0, 10): print(number)"
into the terminal window and hit enter.
Compilation and Execution
What’s going on here?
"python" is a command (**) that runs programs written in the Python programming language. When you run it as above it does two things:
Compiles the program.
Executes the compiled program.
Compilation is the act of turning the program from Python into machine code. Execution is the act of applying the machine code to the computer’s circuits. If the program is written correctly then the execution will yield the result the programmer intended.
The program above compiles into machine code that looks something like this:
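You can produce such a listing yourself with Python's built-in dis module, which prints the low-level instructions the one-liner compiles to (strictly speaking bytecode rather than true machine code; a small cheat owned up to at the end of this post):

import dis

# Print the low-level instructions Python compiles our example program into.
dis.dis("for number in range(0, 10): print(number)")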
Note that even this machine code is written using English-y words. What the computer really sees, of course, is just a lot of zeros and ones.
Programming is the art and science of turning informal ideas into a description just formal enough for a computer to understand. The computer then takes it the rest of the way:
In my next post I’ll discuss some of the programming languages in common use today (and also explain that I slightly cheated in describing the output of Python compilation as machine code).
(*) No anglocentrism intended. But in practice programming languages are always English-based.
(**) This so-called command is itself a program: a special kind of program whose job is to run other programs! You may be wondering what language the “python” program itself is written in? It can’t be written exclusively in Python because then what would run it? This and more will be discussed in my next post.
Almost everything we do at Foursquare is heavily data-driven. Our over 4.5 billion check-ins represent the living pulse of a city. If you haven’t seen these gorgeous visualizations, I highly recommend them. They compellingly emphasize the immense value of check-in data.
Check-in data (and also other data such as tips) power local search in Explore, allowing us to provide specific, personalized recommendations based on everything from the past behavior of you and your friends to the time of day.
Check-in data also powers the very precise ad targeting that is key to the success of our fast-growing ads business.
But don’t just take my word for it. If you’re interested in how we leverage our data to power both search and monetization, come to our San Francisco tech talk event on Tuesday, October 15th, at 7:00 PM.
Perhaps equally importantly, there will be dinner and drinks… Not to mention a chance to mingle with engineers from Foursquare and other top tech companies (including many who use the Foursquare API. Just sayin’…)
Interested? Sign up here, and see you on the 15th!
"We set sail on this new sea because there is new knowledge to be gained, and new rights to be won, and they must be won and used for the progress of all people."
- President John F. Kennedy, “Moon Speech”, Rice University, September 12th, 1962
Much of the coverage of Calico, the Google-backed venture to extend human longevity, has focused, sometimes skeptically, on the end goal. People are asking: can anyone - even Google - really defeat aging?
That question misses the point.
Larry Page has referred to Calico as “moonshot thinking around healthcare and biotechnology”, and that metaphor is no accident. The original moonshot did achieve its end goal. But, equally importantly, it triggered a wave of basic R&D that transformed the technology landscape.
The huge budgets ($25 billion, or well over $100 billion in today’s dollars) spent on the Apollo program in the 1960s had three complementary benefits beyond the direct achievement of landing a human on the moon:
R&D Infrastructure. The Apollo program funded the construction of research facilities, such as the Johnson Space Center and the Center for Space Research at MIT, that are still in use 50 years later. The resources and disciplines developed during the Apollo program are still yielding important scientific discoveries.
Spinoff Technologies. The Apollo program, and the space shuttle program that followed it, created a great many technologies that to this day impact our lives in myriad ways. Particularly notable are the huge advances in computing and telecommunication.
Promoting Science. The Apollo program drove a public fascination with science that lasted a generation. It unabashedly placed technology on a pedestal, and promoted the causes of rational thinking and scientific discovery.
A biomedical “moonshot” program like Calico could drive similar benefits, albeit on a smaller scale, given the more modest budgets. Promoting basic R&D in biochemistry, robotics and other sciences may yield spinoff technologies that will benefit our lives and capture our imagination, regardless of whether it actually achieves a large increase in human longevity.
A “failure” of Calico may still be a huge success, and we should weigh the merits of this new venture just as much by the supposedly ancillary benefits as by the progress towards the end goal.
Fittingly, the venture of extending life has this in common with life itself: the true purpose is not the destination but the journey.
It’s often the case that the true value of a startup lies not with its technology, or even with its user base, but with its data. When millions of people use your service every day, you almost can’t help gathering large amounts of interesting data about what they do. For example:
Google knows what people search for, and which results they click on.
Foursquare knows what people are looking for nearby, and where they end up going.
Uber knows where people need rides, and where those rides take them.
It doesn’t take much to see the value of this data: Google can rank the results people actually click on higher in future searches, Foursquare can use check-ins to make better, more personalized recommendations, and Uber can use ride data to predict demand and ensure an adequate supply of cars at needed locations.
I wouldn’t normally see a Frat Pack comedy in theaters. That’s what mainscreen entertainment on United Airlines flights is for. But as a Google alum, I was curious to watch The Internship. So I went to see it this weekend, with a group of current and former Googlers. Spoiler alert: It’s terrible.
[Update 7/1/2013: Since BI decided to link to this post, and at least one person who worked on the movie took offense, I’ve revisited this and have some clarifications.
I regret the use of the word ‘terrible’. From a craft perspective the movie is actually very well-made. The BI writer is correct in saying that my objections pertain to sociology, not moviemaking. But when the movie mocks ‘my people’ as laughable stereotypes, I take that personally.
There is PLENTY to poke fun at in Silicon Valley. Our inflated sense of self-importance, for one. In fact, the best moment of the movie is when Max Minghella’s character, assembling his rival team, asks an intern:
- “Where did you go to school?” - “The University of-” - “No.”
That had the ring of truth to it. It was a deft poke at Google’s (former?) obsession with academic excellence.
But the movie made too few forays down that path, and instead went mostly for stereotype-pandering, especially of women. If you’re offended by my review then I’m sorry, but I’m also offended by your movie…]
To an ex-Googler the movie may be mildly entertaining. Not because it’s particularly funny, but as an extended game of “spot the cafeteria”. And I don’t mind the obvious nonsense, such as the Hunger Games-like intern job competition, or the apparent lack of any distinction between different roles at a company. I can stomach those as fictions necessary to create a story. No, what makes The Internship excruciating is the lazy pandering to every imaginable Silicon Valley stereotype.
The human digestive system is wondrous. Complex organs and glands process a wide variety of foods, channeling energy and nutrients into the bloodstream while diverting waste out. We each walk around with an amazing little factory inside us.
Creationists use complex biological systems like these to argue for the existence of a divine creator. No evolutionary process, they say, could have created something so marvelous (and it does indeed seem miraculous). Rather, they claim, such systems must be the product of ‘intelligent design’ (ID).
However, on closer inspection, the digestive system does exhibit some puzzling design choices. For example, the digestive and respiratory systems share an entrance: Every bite of food we eat passes perilously close to the trachea, stopped only by the epiglottis folding over the airway when we swallow. And indeed, thousands of people choke to death every year in the US. Doesn’t sound very intelligent at all, does it?
Biological ‘hacks’ like the epiglottis betray the fact that the digestive system is not, after all, intelligently designed. Rather, it’s the result of blind evolution by natural selection.
From Biology to Software
Why am I going on about the digestive system? Because software systems, like biological ones, involve large, complex designs built up from small simple ‘cells’. And so software design too can either be evolved or ‘intelligent’ (*).
If you work in the tech industry then your daily conversations are littered with tech terms. You’ll probably have at least a vague idea of what these mean, but if you’re not in a technical role it’s sometimes hard to put these concepts and buzzwords in precise context.
In this post I’ll briefly explain ten basic terms that engineers use every day. Whatever your role in the tech industry, you’ll benefit from knowing exactly what these mean.
Brevity will require me to leave many important details out. If you’d like me to elaborate further, or if there are other concepts you’d like explained, let me know! I’ll be happy to write another post in this vein in the future.
Anyone with access to the internet now has within reach an incredible array of informal programming education resources, including many specifically targeted at women (a demographic still underrepresented in CS departments). With that, and the availability of simple web application frameworks such as Ruby on Rails, you can learn to create basic but real webapps in just a few weeks. What was once the province of the nerdy few is now available to many.
My previous CS101 post explained what operating systems are, and what services they provide. This post offers a quick tour through some basic operating systems concepts, and explains in more detail how the OS provides certain services.
I’ll describe in turn how the OS manages each of the four basic resources: CPU, disk, memory and network. You’ll probably have heard of some of the concepts before, but not known exactly what they referred to. So now you’ll know!
In a previous post I mentioned that modern computer systems consist of layer upon layer of increasingly complex building blocks. In this post I’ll talk briefly about the most basic of these building blocks: the operating system.
An operating system (OS) is a piece of software that provides a set of common services to all the other software running on a computer. These services primarily involve managing shared resources, notably CPU, disk, memory and network.
Two classes of OS dominate the desktop world: Microsoft Windows, and UNIX-like OSes. UNIX was an OS originally developed over 40 years ago at Bell Labs. It inspired a host of descendants, and its design lives on to this day, including in popular OSes such as Linux, FreeBSD and Mac OS X.
Why are OS services important? For two main reasons: interface and coordination.
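As a small taste of the interface part (a sketch using Python's standard library), the same few lines work on Windows, Linux or Mac OS X precisely because the OS presents a common interface to processes and disk:

import os

# The OS assigns this running program a process id and schedules its CPU time.
print("This program is running as process", os.getpid())

# The OS mediates all access to the disk; we just ask for a file by name.
with open("hello.txt", "w") as f:
    f.write("The operating system wrote this to disk for us.\n")

print("Files in the current directory:", os.listdir("."))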
A few weeks ago @shanley published this post about the superficiality of what passes for “company culture” in much of Silicon Valley.
In the spirit of the April Fools’ Day grinch, I’m going to add another one: Culture is not about telling semi-clever lies on your corporate blog once a year.
Every year, come April 1st, dozens of tech companies, from Google down to the smallest startups, post oh-so-hilarious faux press releases about some obviously absurd product move or feature launch. You can imagine the self-satisfied giggles at the brainstorming sessions in the marketing department. The trouble is, no one else is really laughing.
The Adria Richards/PyCon/SendGrid affair has made me sick to my stomach. Any decent human being should be outraged and sad at how low so many people in the wider tech world can sink.
I’m not going to express an opinion on the original incident. How far out of line were those guys? Did Adria overreact? Could she have handled it better? I don’t know, I wasn’t there, and it doesn’t matter any more.
Someone asked me at a bar gathering yesterday: “Is computer science really a hard science? Isn’t it more like engineering?” I had at that point had one too many drinks to give a coherent answer. Plus, nothing kills small talk faster than bringing mathematics into the conversation…
But it is actually a good question. What is the distinction between computer science and software engineering?
Alexia Tsotsis just published an important post on TechCrunch about an insidious side of startup culture, one she refers to as “the cult of success”.
In the startup world we pay lip service to risk all the time. No concept is more hallowed or hyped in Silicon Valley than “entrepreneurship”, and the defining feature of entrepreneurship is risk.
You Can’t Spell Risk Without Failure
Risk, by definition, implies frequent failure. Yet while we recognize the value of failure in principle, when presented with an individual instance of it, one involving a specific startup and actual people, we too often respond with snark and schadenfreude. TechCrunch, and the rest of the tech press, are not immune from this, as Alexia admits. Too often, how the ups and downs of a startup get covered depends more on how close the founders and investors are to the “cool kids” than on the merits or the long view.
What’s an appropriate response to failure, then? Should we celebrate it, as we do success?
If you want to revert to the 19th century, you can leave it at that. But a religious political party in Israel are now aiming much further back: United Torah Judaism, a fundamentalist ultra-orthodox party, are running this ad in the campaign for next week’s knesset (parliament) election:
You suffered admirably through my necessary but dense preliminary discussions of boolean logic, binary arithmetic and memory hierarchy. Now comes the payoff - a series of posts about things you’ve actually heard of. First up: software.
I’m sure you have at least a rough idea of what hardware and software are. In fact, if you’re reading this, you probably know a lot of people who write software for a living. But you may be wondering what it means to “write software” or “run a program”. Or you may still marvel at how it is that we can make a pile of electronic circuits into some magical device that can show us pictures of kittens on skateboards. Read on to find out more!
I was saddened to learn of the tragic death, by his own hand, of Aaron Swartz. That a prominent member of my community, our community, the tech community, is gone forever is sad. That he was lost so young is tragic. That he took his own life is a horror.
Aaron Swartz wrote openly about having depression (I don’t like to use “depressed” as an adjective; it’s a disease that you have, not a thing that you are.) And many of the outpourings of grief and support after his death recognized in it a shared experience. The relatability of Aaron’s struggle made the tech world feel even more like a community, as it did after Ilya Zhitomirskiy’s tragic suicide in 2011. For, while depression is dreadful for anyone, it may have a uniquely pernicious effect on highly logical minds.
In my first two CS101 posts (here and here) we discussed the basic electronic circuits used to compute logic conditions and basic arithmetic. At the end of my last post I alluded to a third element we need before we can construct something worth calling a ‘computer’. That element is memory.
Memory gives us the ability to have the current computation be influenced by the result of past computation. It’s what allows us to compose sequences of basic operations to produce ad-hoc, complex computations. These sequences of instructions are called programs.
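As a hypothetical sketch (a toy instruction set invented for illustration, nothing like real hardware), here is a tiny 'machine' in Python whose memory lets the result of one step feed into later steps, which is the essence of a stored program:

# Four memory cells, plus a program that updates them in sequence.
memory = [0] * 4

program = [
    ("SET", 0, 5),     # memory[0] = 5
    ("SET", 1, 7),     # memory[1] = 7
    ("ADD", 2, 0, 1),  # memory[2] = memory[0] + memory[1]
    ("ADD", 3, 2, 2),  # memory[3] = memory[2] + memory[2], reusing an earlier result
]

for instruction in program:
    if instruction[0] == "SET":
        _, cell, value = instruction
        memory[cell] = value
    elif instruction[0] == "ADD":
        _, dest, a, b = instruction
        memory[dest] = memory[a] + memory[b]

print(memory)  # [5, 7, 12, 24]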
A non-technical, political post today (I did warn readers there would be some).
The overriding theme of the upcoming presidential election is jobs, with the candidates sparring aggressively over who can create more of them. Unemployment is still high across the country, and finding work, especially if you’re a new college grad, is a daunting challenge.
And yet here in Silicon Valley companies can’t hire fast enough. If you’re a new CS grad you’ll have no trouble at all getting several job offers to choose from. Your starting salary, on day one of your career, might be higher than the current salary of either of your parents. And if you’re an experienced engineer, designer or product manager then the sky is pretty much the limit.
This raises tempting questions: Can the presidential candidates learn from Silicon Valley’s success? Does the tech industry point the way towards a nationwide economic renaissance of job growth? Is what’s good for startups good for America? And if so, do the political endorsements of tech luminaries carry extra weight?
Remember my post about hashing passwords? I recently encountered a tweet exchange between two of my co-workers that used hashing in a novel and nerdy way.
It started with this:
Jorge is posting a quote from Minnesota congresswoman and amateur conspiracy theorist Michelle Bachmann, helpfully providing a link to the source. But he’s also added a mysterious string of letters and numbers in parentheses. Why? Let’s find out.
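I'm only guessing at the scheme here, but a classic nerdy use of hashing is a 'commitment': publish just the hash of some text now, and reveal the text later to prove you'd written it in advance. A minimal sketch with Python's hashlib (the prediction text and the choice of SHA-256 are stand-ins, not necessarily what Jorge used):

import hashlib

prediction = "She will blame the conspiracy again before the election."

# Publish only this digest now...
digest = hashlib.sha256(prediction.encode("utf-8")).hexdigest()
print(digest)

# ...then reveal the prediction later. Anyone can re-hash the revealed text
# and check that it matches the digest published earlier.
assert hashlib.sha256(prediction.encode("utf-8")).hexdigest() == digest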
Google recently published a much-anticipated paper on Spanner, their globally-distributed database. Spanner is the new pinnacle of Google’s technology stack, supplanting Bigtable at the cutting-edge of scalable databases. The authors will present the paper at the OSDI (Operating Systems Design and Implementation) ‘12 conference in October.
Google’s OSDI papers tend to be seminal: the MapReduce paper from OSDI ‘04 sparked the open-source Hadoop project, which in turn powered the industry-wide “big data” trend. And the Bigtable paper from OSDI ‘06 sparked a host of similar systems (HBase, Cassandra, MongoDB and others) that power services such as Facebook, Twitter, Foursquare and many more. Will Spanner spark a trend in the same way? It’s too soon to say, but there are arguments in both directions.
In my debut CS101 post I discussed Boolean algebra, and how any Boolean function can be modeled by a real-life electronic circuit. In this post we’ll explore how this fact provides computers with the ability to do arithmetic.
To do arithmetic, you first need to be able to represent numbers. The most basic numbers are the natural numbers, the whole numbers we use for counting things. To write down a counting number we can simply represent it as a list of ‘things’ of that size. Say we pick the hash symbol # to represent a ‘thing’, then we can write down the counting numbers as #, ##, ###, ####, ##### and so on. Of course we’d also need some symbol, like 0, to represent “no things”.
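As a quick illustration of where this is headed (a sketch in Python rather than actual circuitry), the Boolean operations from the previous post are enough to add two one-bit numbers:

def half_adder(a, b):
    # Add two single bits using only Boolean operations.
    sum_bit = a ^ b   # XOR: 1 if exactly one input is 1
    carry = a & b     # AND: 1 only if both inputs are 1
    return sum_bit, carry

def full_adder(a, b, carry_in):
    # Chain two half adders to handle the carry from a previous column.
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, c1 | c2  # OR the two carries

for a in (0, 1):
    for b in (0, 1):
        print(a, "+", b, "=", half_adder(a, b))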
Several people have suggested that I write a series of “CS101” posts explaining computer science and software engineering fundamentals, unrelated to anything in the news cycle. This idea appealed to my inner didact, so I’ll try a few posts along those lines and see what responses I get. Feel free to comment and/or suggest topics.
Hopefully, reading these posts will provide non-engineers in the tech industry with a better sense of what goes on under the hood, of what their engineering colleagues do, and of why software engineering is such a fascinating, intricate discipline. To any engineers reading this: you’ll have to forgive my simplification of many concepts, in the interest of clarity and brevity.
I’ll kick off, appropriately, by discussing the lowest level concept in Computer Science…
Wired Enterprise recently published an excellent article about how Google revolutionized distributed systems, creating the technologies that power not only Google, but much of the rest of the internet. The article gives some much-deserved limelight to Jeff Dean and Sanjay Ghemawat, accurately describing them as “two of the most important software engineers of the internet age — and two of the most underappreciated”. I might also add, “two of the most unassuming”.
I won’t belabor points already made in that article, but it’s worth emphasizing the key insight that underlies most of that groundbreaking work:
Building Internet-scale systems is the art of constructing a reliable whole from unreliable parts.
These days this may sound obvious to most engineers. But it’s easy to forget that in the first dot-com era, the “big iron” approach was far more common.
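My own toy illustration of that insight, not Google's actual machinery: if any single replica of a service might fail at any moment, you can still build something dependable on top by retrying against other replicas.

import random

def flaky_replica(request):
    # Stand-in for a server on commodity hardware: it fails fairly often.
    if random.random() < 0.3:
        raise ConnectionError("replica down")
    return "response to " + request

def reliable_fetch(request, replicas, attempts=5):
    # Build a (much more) reliable whole out of unreliable parts by retrying.
    for _ in range(attempts):
        try:
            return random.choice(replicas)(request)
        except ConnectionError:
            continue  # that replica failed; try another one
    raise RuntimeError("all attempts failed")

print(reliable_fetch("GET /index", [flaky_replica] * 3))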
If you’re interested in Scala and/or in build systems, Typesafe just published a blog post about Zinc, the standalone incremental Scala compiler. Among other things, it mentions the work I’ve been doing on integrating Zinc into Pants (Twitter’s open-source build system; we’re moving towards using it at foursquare).
I received a rather touching message on Quora (of all places) yesterday. In it a metallurgy engineering student in India writes:
From my childhood I was very much fond of computers but due to some reasons I wasn’t able to get into computer science engineering. Is there any other way I can become a software engineer?
Well, a good start might be to pick up an elementary programming book, or take an online course, or a vocational course, if any are offered where you live. You can learn some of the basics, enough to build a small website or a simple app. There are many entry-level jobs outside the tech industry, e.g., in corporate IT departments, that may be ably performed by a self-taught programmer. Once you have that entry-level job you’ll learn by doing, and your programming skills can rapidly improve to the point where you can build reasonably complex programs.
Business Insider posted this a couple of days ago. Someone took a bunch of tech job postings offering salaries of over $100K and counted out the most popular buzzwords in those postings. By this analysis, the most in-demand tech skills include PowerBuilder, Silverlight and… drumroll… UML(!)
OK, so if you’re not technical these may mean nothing to you, but most tech industry engineers reading this are already chuckling at the thought of anyone using UML. Before we get too snooty though, remember that the world of corporate IT is very different from the tech industry, even though both employ software engineers.
Last week I wrote about algorithms, and in particular about what artificial intelligence (AI) algorithms are and how they differ from other algorithms. I promised to follow up with a post about “strong AI”, so here goes.
Strong AI is the name given to a hypothetical artificial intelligence that meets or exceeds human intelligence. That is, an algorithm that can cause a machine to perform any intellectual task that a human can. Strong AI does not currently exist, and whether it can possibly exist is not just a Computer Science problem but a philosophical one.
The tech press often uses “algorithm” as a synonym for “secret sauce”. The word conjures up images of a mysterious black box that crunches data to produce movie recommendations or search results or a social news feed. Algorithms, particularly those that guess human intent, are often among a company’s most prized intellectual properties. However these magic black boxes are just one type of algorithm. The word encompasses a much more elemental concept.
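To ground that more elemental meaning, here is one of the oldest algorithms on record, written out in Python (Euclid's method for the greatest common divisor; my example, not from the post above):

def gcd(a, b):
    # Euclid's algorithm: a precise, finite recipe that always terminates
    # with the greatest common divisor of two positive integers.
    while b != 0:
        a, b = b, a % b
    return a

print(gcd(1071, 462))  # 21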
[Got another TechCrunch guest post today. Reposting here for your convenience.]
No, really, what is Google? TechCrunch co-editor Alexia Tsotsis recently posted an interesting piece about Google’s focus, or rather the perceived lack of it. Google has its fingers in so many pies that there are many different angles from which to consider this question. But the key to most of them ends up being the question of what the company is.
For consumers, Google is, or at least used to be, a search company. The title of Alexia’s post says it all: “Remember When Google Was a Search Engine?” On the other hand, for investors, and cynics, Google is an ad network. That is, after all, where the money comes from.
But as a former Googler and unabashed fan of the company (take this as both full disclosure and a disclaimer), I have a different perspective. For me, Google is, and always has been, a systems company.
[I was privileged to get a guest post on TechCrunch today, which I’m reposting here for your convenience. For the record: I never worked on ad targeting at Google, but some of the smartest people I know did. ]
Microsoft recently announced that it’s taking a huge $6.2 Billion writedown over the failed aQuantive acquisition. This news, and the scrutiny of Facebook’s business model following their IPO drama, show that, in online advertising, it’s all about the targeting.
News of the recent password leak at Yahoo! comes hot on the heels of similar breaches at LinkedIn, eHarmony and last.fm. These leaks are infuriating, not just because these companies got hacked in the first place, but because they failed to adhere to basic password security practices, such as hashing and salting. Although those sound like something delicious you do with potatoes, they are also basic ways of protecting your users’ passwords.
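For the curious, here's a minimal sketch of what hashing and salting look like, using only Python's standard library (a real system should prefer a dedicated password-hashing scheme such as bcrypt or scrypt):

import hashlib
import os

def hash_password(password):
    # Store a random salt plus a slow, salted hash; never the password itself.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100000)
    return salt, digest

def check_password(password, salt, expected_digest):
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100000)
    return digest == expected_digest

salt, stored = hash_password("correct horse battery staple")
print(check_password("correct horse battery staple", salt, stored))  # True
print(check_password("letmein", salt, stored))                       # False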
APIs (Application Programming Interfaces) are everywhere these days, allowing services to interact across the web in all sorts of interesting and useful ways. Reader (and ski lease buddy) @jerememonteau suggested that, since it’s such a popular buzzword, I write a post explaining what an API is, exactly. So here we go…
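As a tiny taste before the explanation proper, using a typical web API boils down to sending an HTTP request and reading structured data out of the response. A hedged sketch (the endpoint below is a made-up placeholder, not a real Foursquare API call):

import json
from urllib.request import urlopen

# Hypothetical endpoint, for illustration only.
url = "https://api.example.com/v1/venues/search?near=San+Francisco&query=coffee"

with urlopen(url) as response:   # the API call is just an HTTP request
    data = json.load(response)   # the answer comes back as structured JSON

for venue in data.get("results", []):
    print(venue["name"], venue["address"])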
Amazon Web Services are now a crucial part of the startup ecosystem, and if you don’t believe me, wait until the next AWS outage and see which of your favorite online services is left standing…
AWS experienced an outage this morning that took multiple services down, including foursquare.
In this case the problem could have been mitigated by serving out of multiple datacenters, something we’ll definitely be looking into in the future.
But, regardless, this underscores the biggest issue with cloud services: it’s all great when things work, but when problems occur, you’re completely at the mercy of another company. This itself is not unusual - many small businesses have strategic dependencies on other, larger businesses. But what is a problem is the opacity - even when things are working normally, and much more so when they aren’t, there’s very little visibility into what’s really going on in the “cloud”. All you can really do is sit back, wait for Amazon to resolve the problem, and then scramble to bring your systems back online.
Kudos to Twitter, who just announced their new partnership with Girls Who Code, a movement dedicated to eliminating the gender gap in science and engineering by educating and inspiring high school girls, giving them the skills and resources to pursue opportunities in those areas. And a special tip of the hat to Twitter engineer @pandemona who helped make this happen, and has long been a strong advocate for women in engineering.
The issue of diversity in science and engineering is fascinating, and receiving more and more attention recently. But many of the discussions around women in Silicon Valley focus on executives, entrepreneurs and VCs, and less on engineers.
The cliché is that software engineers are overwhelmingly male. And, to be honest, a field trip through Silicon Valley will do little to dispel this stereotype. But why is this so?
Came across this fascinating blog post from online backup company Backblaze (h/t @hoffrocket for the link). If you’re an infrastructure geek you’ll love the detailed description of their custom-built storage pod. But even if you’re not, this chart should catch your eye:
Cloud computing services are all the rage now, with more and more startups forgoing their own hardware build-outs in favor of hosted solutions, most prominently Amazon Web Services.
By now I’m sure most of you saw Saturday’s Google doodle, commemorating Alan Turing’s 100th birthday.
Turing, as you’ve probably either read or already knew, was a British mathematician regarded as the father of computer science. His work as a codebreaker during the second world war contributed substantially to the allied victory. Tragically, not even his invaluable service to his country was enough to save him from persecution for being homosexual, leading to his untimely death at the age of 41.
Turing made many contributions to computer science, but the one that stands out is the concept the doodle illustrated: the Turing machine. A Turing machine isn’t an actual machine, or even a blueprint for one. It’s a mathematical idealization of a computer, conceived by Turing long before real computers existed. The centrality of the Turing machine concept in computer science is why every software engineer you know squealed with delight on seeing that doodle.
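A minimal sketch of the idea in Python (the particular machine below, one that flips every bit on its tape, is my own invented example):

# A tiny Turing machine simulator: a tape, a read/write head, a state,
# and a table of rules saying what to do in each (state, symbol) situation.
def run(tape, rules, state="start", blank="_"):
    tape = list(tape)
    head = 0
    while state != "halt":
        symbol = tape[head] if head < len(tape) else blank
        write, move, state = rules[(state, symbol)]
        if head == len(tape):
            tape.append(blank)   # the tape is unbounded on the right
        tape[head] = write
        head += 1 if move == "R" else -1
    return "".join(tape)

# A machine that walks rightwards, flipping 0s and 1s, and halts at the first blank.
flip_rules = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

print(run("0110", flip_rules))  # 1001_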
I know, I know, I’m a little late to the game… My entry into the world of self-publishing is clearly long overdue. I hope to make up for my tardiness by providing plenty of content worth reading.
For those of you who don’t know me: My name is Benjy Weinberger, and I’m currently the engineering site lead for foursquare's San Francisco office. Before that I worked on Infrastructure and Revenue engineering at Twitter, and before that spent eight years at Google working on Ads and Search engineering. I've had the tremendous good fortune to have worked at some of the best tech companies out there, with some of the best people in this, or any other, industry, engineers and non-engineers alike.