What the Heck Is Hyperscale?

I first heard the term hyperscale around 2013 when I was managing Google’s data center hard drive software team. As hard drive and SATA controller vendors came in to share their road maps, they started referring to the needs of hyperscalers as something different from those of traditional enterprise users. Google’s needs being different from traditional enterprise wasn’t new, but the term hyperscale to refer to Google and others was. Google didn’t use this term internally, and as far as I know, neither did Amazon nor Facebook. Nor did our needs always overlap. What did the industry mean by calling us hyperscale?

Seven years later, I’m still wondering what hyperscale means. Research papers and marketing materials talk about hyperscalers without ever clarifying what it means. Who qualifies as a hyperscaler? What are the criteria? I set out to find a definition that answers these questions and, finding little, created my own.

Looking for an origin story

With how broadly used the term is, I naively assumed that someone must have defined it somewhere, yet web searches turned up nothing useful. After a few false starts, I discovered that I could filter Google search results with a date range. This was still far from ideal, as many results showed very old dates despite the contents being obviously more recent. Regardless, I could find candidate results this way and then use the Internet Archive’s Wayback Machine to verify that the content actually existed at the date Google claimed.

Using that method, the earliest use of hyperscale (as related to computing) I’ve been able to find is “Creating a Hyper-efficient Hyper-scale Data Center” in Dell Power Solutions magazine’s February 2008 issue. While the bulk of the article is pitching Dell’s newly formed Data Center Solutions Division, the introduction gives an overview of how cloud and large cluster computing environments differ from traditional environments. The article highlights how these environments focus on maximizing efficiency at every level of data center design, from machines to power and cooling infrastructure. While this is generally true, the article places the emphasis on the solutions used by these environments rather than on when those solutions are appropriate. Does simply choosing to trim unnecessary components from servers and using a hot aisle make you a hyperscaler? I’m inclined to say no. I’m also a bit dubious that Dell understood hyperscale well enough at the time to speak authoritatively. The same article describes these cloud computing environments as containing thousands, or even tens of thousands, of servers. By 2008, Google and Amazon had deployed well over a hundred thousand servers.

Crowdsourcing a definition

With my search for an origin story failing to uncover a definition, I began to wonder if our collective usage of the term hyperscale would uncover some consistent traits that could be formed into a definition. Knowing that my Twitter followers tend to skew toward servers and hyperscale, I posed the question there:

I anticipated diverse responses, especially as retweets began to elicit answers from a broader audience. Despite my anticipation, I was dumbfounded by the breadth of attributes and the decisiveness of the responses. Here are a few to give a flavor:

Again, these definitions mostly focused on quantitative attributes, often the number of servers, or applied specific solutions to problems perceived to be experienced only at hyperscale. I was intrigued by the suggestions related to ratios. That spoke to me of some underlying competing-needs problem. Maybe we could define hyperscale in terms of the constraints rather than the outcomes. That led me to think about the times I’ve heard electrical engineers discuss the term “high-speed signal.”

Inspiration from electrical engineering

I’ve spent a lot of time around electrical engineers, especially those designing computer motherboards. At various times, the topic of what constitutes a high-speed signal has come up. When asked for a description or definition, electrical engineers give a variety of common answers depending on their experience. Students and interns often suggest it has to do with the frequency of the signal. Junior engineers will often talk about setup and hold times. While these are all relevant to the concept, they are not complete definitions.

Prof. Chris Diorio takes a different approach in his CSE467 handout on high-speed signaling. First, he describes the foundational abstractions of digital design:

  • Digital interpretation of analog values
  • Logic devices as idealized Boolean primitives
  • Steady-state abstraction
  • Finite-state behavior of sequential systems

High-speed is then defined as the point where those abstractions break down as a consequence of circuit speed. This avoids identifying a specific speed (which will vary with materials and logic technologies), set of problems, or possible solutions. Instead, it’s an observation that there is an inflection point related to speed where the set of problems that need to be considered changes.

Prof. Diorio continues by showing various ways in which those abstractions might break down as speed increases. For each of these potential problems, common solutions are also available. So what does it mean to have a high-speed signal? Something is broken, and now you need to understand what and why so you can figure out which solutions make sense for your situation.

Defining hyperscale

Borrowing Prof. Diorio’s approach, what is the foundational abstraction of business computing? Businesses existed before computing, yet computers have been adopted by nearly every business, which implies that they must provide value greater than their costs. Extending that line of thinking from a business/economics perspective:

  • An IT investment must bring a potential revenue increase or opex reduction greater than its anticipated total cost of operation
  • Total IT investments must not detract from the primary business focus
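The first rule is essentially a cost-benefit inequality. As a rough sketch (the function name and dollar figures below are hypothetical, purely for illustration):

```python
# Sketch of the first rule: an IT investment is justified when its
# expected benefit (revenue increase plus opex reduction) exceeds
# its anticipated total cost of operation.

def investment_justified(revenue_increase: float,
                         opex_reduction: float,
                         total_cost_of_operation: float) -> bool:
    """Return True when the expected benefit exceeds the cost."""
    return (revenue_increase + opex_reduction) > total_cost_of_operation

# Hypothetical example: a $500k server deployment expected to add
# $300k in revenue and save $250k in operating expense.
print(investment_justified(300_000, 250_000, 500_000))  # True
```

The second rule is harder to quantify, which is part of why the inflection point differs so much from business to business.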

Applying Prof. Diorio’s approach, hyperscale is then an inflection point where those rules break down as a consequence of scale of IT deployments. Past that inflection point (aka operating at hyperscale), the needs of the business cannot be met through straightforward purchasing of additional servers and using mainstream administration techniques and tools. Exactly where this point is will vary from business to business and continue to change over time. This implies that at any point in time many more companies are hyperscalers than is commonly believed.

While the exact problems faced by a hyperscale business will also vary, common problems and solutions to those problems have emerged:

  • High opex due to a high admin-to-server ratio => treat servers as cattle instead of pets, heavy use of automation
  • High capex due to off-the-shelf equipment => white box and defeatured equipment
  • High opex due to high PUE => improve power and thermal efficiency through whole-building co-design
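For the last item, PUE (power usage effectiveness) is the ratio of total facility power to the power delivered to IT equipment, with 1.0 as the ideal. A quick sketch with made-up numbers shows why hyperscalers chase it:

```python
# PUE = total facility power / IT equipment power; 1.0 is ideal.
# The overhead (cooling, power distribution losses, lighting) is
# everything above the IT load. All figures below are hypothetical.

def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Compute power usage effectiveness from two power readings."""
    return total_facility_kw / it_equipment_kw

# A traditional room: 2 MW total to run 1 MW of IT load.
print(pue(2000.0, 1000.0))  # 2.0

# A whole-building co-designed facility: 1.1 MW total for the same load.
print(pue(1100.0, 1000.0))  # 1.1
```

At hyperscale IT loads, the difference between those two ratios is megawatts of continuous overhead, which is why whole-building co-design pays for itself.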

What seems to differentiate hyperscalers is not their scale but how they have adapted to the new challenges they face. Most striking is that the most prominent hyperscale businesses today universally have software development as a core competency intrinsically linked to their primary business focus. They have been able to justify an overall larger IT investment by leveraging their existing software development teams to create software tools that let them not only operate at hyperscale but thrive there. That is a topic worthy of its own post.