Scaling Event Sourcing for Netflix Downloads, Episode 1

Early in 2016, several Netflix teams were asked the question: “What would it take to allow members to download and view content offline on their mobile devices?”

For the Playback Licensing team, this meant that we needed to provide a content licensing system that would allow a member’s device to store and decrypt the downloaded content for offline viewing. Doing this securely would require a new service to handle a complex set of yet-to-be-defined business validations, along with a new API for client and server interactions. Further, we determined that this new service needed to be stateful, when all of our existing systems were stateless.

“Great! How long will that take you?”

In late November 2016, nine short months after that initial question was asked, Netflix successfully launched the new downloads feature that allows members to download and play content on their mobile devices. Several product and engineering teams collaborated to design and develop this new feature, which launched simultaneously to all Netflix members around the world.

This series of posts will outline why and how we built a new licensing system to support the Netflix downloads experience. In this first post of the series, we provide an overview of the Netflix downloads project and the changes it meant for the content licensing team at Netflix. Further posts will dive deeper into the solutions we created to meet these requirements.

How Streaming Playback Works

When a member streams content on Netflix, we deliver data to their device from our back-end servers before the member can commence playing the content. This data is retrieved via a complex device-to-server interaction on our Playback Services systems, which can be summarized as follows.

 

To play a title, the member’s device retrieves all the metadata associated with the content. The response object is known as the Playback Context, and includes data such as the image assets for the content and the URLs for streaming the content (see How Netflix Directs 1/3rd of Internet Traffic for an excellent overview of the streaming playback process and systems). The streamed data is encrypted with Digital Rights Management (DRM) technology and must be decrypted before it can be watched. This is done through the process of licensing, where a device can request a license for a particular title, and the license is then used to decrypt the content on that device. In the streaming case, the license is short-lived, and only able to be used once. When the member finishes watching the content, the license is considered consumed and no longer able to be used for playback.

Netflix supports several different DRM technologies to enable licensing for the content. Each of these lives in its own microservice, requiring independent scaling and deployment tactics. This licensing tier needs to be as robust and reliable as possible; while many services at Netflix have fallbacks defined to serve a potentially degraded experience in the case of failures or request latency, the licensing services have no fallbacks possible. If licensing goes down, there is no playback. To reduce the risks to availability and resiliency, and to allow for flexible scaling, the licensing services have traditionally been stateless.

And Here Come Downloads…

The downloads flow differs slightly from the streaming one. Similar to the streaming flow, we generate a Playback Context (Metadata) for the downloaded content. Once we have the metadata for the content, we can start the license flow which is depicted as follows:

After checking to ensure a title is available for downloading, the member’s device attempts to acquire a license. We then perform several back-end checks to validate if the member is allowed to download the content. If the various business rules are satisfied, we return a license and any additional information used to play the content offline, and the device can then start downloading the encrypted bytes.

The license used for downloaded content is also different from streaming — it may be persisted on the device for a longer period of time, and can be reused over multiple playback sessions.

Once the title is downloaded to the device, it has a lifecycle as follows:

Every time a member presses play on the downloaded content, the application queues up session events to provide information on the viewing experience, and sends those up the next time the member goes online. After a defined duration for each title, however, the original license retrieved with the downloaded content expires. At this point, depending on the content, it may be possible to renew the license, which requires the device to ask the back-end for a renewed license. This renewal is also validated against the business rules for downloads and, if successful, allows the content to continue to be played offline. Once the member deletes the content, the license is securely closed (released), which ensures the content can no longer be played offline.

A Maze of Restrictions

Netflix works with a variety of studio partners around the globe to obtain the best content for our members. The restrictions for downloaded content are generally more complex than for streaming, and far more varied amongst the studios. In addition to requirements related to how long a title can be watched, we have a variety of different caps based on the number of downloads for a device or per member, and potentially limitations on how many times the title can be downloaded or watched during a specified period of time.

We also have internal business restrictions related to download viewing, such as the number of devices on which content can be downloaded.

We must apply all of these restrictions across all the combined movies that a Netflix member downloads. Each time a member downloads a title or wants to extend the viewing time after the initial license expires, we must validate the request against all of the possible restrictions for the partner, taking into account the member’s past download interactions. If the request violates any one of these restrictions, the back-end sends back a response with the reason the download request failed.
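The validation described above can be pictured as folding a member’s past download events into a small summary and then checking each rule against it. The following is only a minimal sketch under stated assumptions, not Netflix’s implementation: the event shapes, the rule names (maxDownloadsPerTitle, maxDevices), the limits, and the reason codes are all hypothetical.

```javascript
// Hypothetical event history for one member; field names are assumptions.
const events = [
  { type: "DOWNLOADED", titleId: "t1", deviceId: "phone" },
  { type: "DOWNLOADED", titleId: "t1", deviceId: "tablet" },
  { type: "DELETED", titleId: "t1", deviceId: "phone" },
];

// Hypothetical limits; real studio restrictions are more varied.
const rules = { maxDownloadsPerTitle: 2, maxDevices: 2 };

// Fold the event history into the state the rules need.
// Simplification: deletions do not free up a device or a download slot here.
function summarize(history) {
  const downloadsPerTitle = new Map();
  const devices = new Set();
  for (const event of history) {
    if (event.type === "DOWNLOADED") {
      const previous = downloadsPerTitle.get(event.titleId) || 0;
      downloadsPerTitle.set(event.titleId, previous + 1);
      devices.add(event.deviceId);
    }
  }
  return { downloadsPerTitle, devices };
}

// Returns { allowed: true } or { allowed: false, reason } so that the
// client can be told why the request was rejected.
function validateDownload(history, request, limits) {
  const state = summarize(history);
  const count = state.downloadsPerTitle.get(request.titleId) || 0;
  if (count >= limits.maxDownloadsPerTitle) {
    return { allowed: false, reason: "TITLE_DOWNLOAD_LIMIT_REACHED" };
  }
  const isNewDevice = !state.devices.has(request.deviceId);
  if (isNewDevice && state.devices.size >= limits.maxDevices) {
    return { allowed: false, reason: "DEVICE_LIMIT_REACHED" };
  }
  return { allowed: true };
}
```

Because the decision depends on the member’s history, the service answering it has to keep state, which is the design pressure the rest of this post describes.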

Beginning of Downloads (and the end of this post)

With the introduction of the downloads feature, we needed to reconsider our approach to maintaining state. The downloads feature requires us to validate whether a member should be allowed playback based on previous downloading history. We decided to perform this validation when the license was requested, so we needed a new stateful service that the licensing services could consult for validating business rules. We had a short period of time to design this new stateful system, which would enforce business rules and potentially reject licenses according to a yet-to-be-defined set of requirements.

We had an amazing opportunity to create a new service from scratch, with an existing user base of millions, and a limited time to create it in. Exciting times lay ahead!

More, Please!

In the next post of this series, we will discuss the service we created to validate and track downloads usage: a stateful storage microservice backed by event sourcing. Future posts will deep-dive into the implementation details, including the use of data versioning and snapshotting to provide flexibility and scale.

The team recently presented this topic at QCon New York and you can download the slides and watch the video here. Join us at Velocity New York (October 2–4, 2017) for an even more technical deep dive on the implementation and lessons learned.

The authors are members of the Netflix Playback Licensing team. We pride ourselves on being experts at distributed systems development and operations. And, we’re hiring Senior Software Engineers! Email kcasella@netflix.com or connect on LinkedIn if you are interested.

GraphQL explained

In this post, I’m going to answer one simple question: How does a GraphQL server turn a query into a response?

If you’re new to GraphQL, get the three minute intro in How do I GraphQL? before reading on. That way you’ll get more out of reading this post.

Here’s the ground we’ll cover in this post:

  • GraphQL queries
  • Schema and resolve functions
  • GraphQL execution — step by step

Ready? Let’s jump right in!


GraphQL Queries

GraphQL queries have a very simple structure and are easy to understand. Take this one:

{
  subscribers(publication: "apollo-stack"){
    name
    email
  }
}

It doesn’t take a rocket scientist to figure out that this query would return the names and e-mails of all subscribers of our publication, Building Apollo, if we built an API for it. Here’s what the response would look like:

{
  subscribers: [
    { name: "Jane Doe", email: "jane@doe.com" },
    { name: "John Doe", email: "john@doe.com" },
    ...
  ]
}

Notice how the shape of the response is almost the same as that of the query. The client-side of GraphQL is so easy, it’s practically self-explanatory!

But how about the server? Is it more complicated?

It turns out that GraphQL servers are quite simple, too. After reading this post, you’ll know exactly what’s going on inside a GraphQL server, and you’ll be ready to build your own.


Schema and Resolve Functions

Every GraphQL server has two core parts that determine how it works: a schema and resolve functions.

The schema: The schema is a model of the data that can be fetched through the GraphQL server. It defines what queries clients are allowed to make, what types of data can be fetched from the server, and what the relationships between these types are. For example:

A simple GraphQL schema with three types: Author, Post and Query

In GraphQL schema notation, it looks like this:

type Author {
  id: Int
  name: String
  posts: [Post]
}

type Post {
  id: Int
  title: String
  text: String
  author: Author
}

type Query {
  getAuthor(id: Int): Author
  getPostsByTitle(titleContains: String): [Post]
}

schema {
  query: Query
}

This schema is quite simple: it states that the application has three types (Author, Post and Query). The third type, Query, is just there to mark the entry point into the schema. Every query has to start with one of its fields: getAuthor or getPostsByTitle. You can think of them sort of like REST endpoints, except more powerful.

Author and Post reference each other. You can get from Author to Post through the Author’s “posts” field, and you can get from Post to Author through the Post’s “author” field.

The schema tells the server what queries clients are allowed to make, and how different types are related, but there is one critical piece of information that it doesn’t contain: where the data for each type comes from!

That’s what resolve functions are for.


Resolve Functions

Resolve functions are like little routers. They specify how the types and fields in the schema are connected to various backends, answering the questions “How do I get the data for Authors?” and “Which backend do I need to call with what arguments to get the data for Posts?”.

GraphQL resolve functions can contain arbitrary code, which means a GraphQL server can talk to any kind of backend, even other GraphQL servers. For example, the Author type could be stored in a SQL database, while Posts are stored in MongoDB, or even handled by a microservice.

Perhaps the greatest feature of GraphQL is that it hides all of the backend complexity from clients. No matter how many backends your app uses, all the client will see is a single GraphQL endpoint with a simple, self-documenting API for your application.

Here’s an example of two resolve functions:

getAuthor(_, args){
  return sql.raw('SELECT * FROM authors WHERE id = %s', args.id);
}

posts(author){
  return request(`https://api.blog.io/by_author/${author.id}`);
}

Of course, you wouldn’t write the query or URL directly in a resolve function; you’d put it in a separate module. But you get the picture.
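To make the idea concrete without a real database or HTTP service, here is a minimal sketch of the same two resolve functions arranged as a resolver map (type name, then field name). The in-memory arrays authorsTable and postsService are hypothetical stand-ins for the SQL table and the remote API, and the map layout is just one common convention.

```javascript
// In-memory stand-ins for the SQL database and the HTTP service (hypothetical data).
const authorsTable = [{ id: 5, name: "John Doe" }];
const postsService = [
  { id: 1, title: "Hello World", author_id: 5 },
  { id: 2, title: "Another post", author_id: 7 },
];

// Resolve functions grouped by type name, then by field name.
const resolvers = {
  Query: {
    // Root field: receives no parent object, only the field arguments.
    getAuthor(_, args) {
      return authorsTable.find((author) => author.id === args.id) || null;
    },
  },
  Author: {
    // Nested field: receives the parent Author object resolved above.
    posts(author) {
      return postsService.filter((post) => post.author_id === author.id);
    },
  },
};

// Calling the functions by hand, the way an executor would:
const author = resolvers.Query.getAuthor(null, { id: 5 });
const authorPosts = resolvers.Author.posts(author);
```

Swapping a backend then only means changing the body of one function; the schema and the clients stay the same.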


Query execution — step by step

Alright, now that you know about schema and resolve functions, let’s look at the execution of an actual query.

Side note: The code below is for GraphQL-JS, the JavaScript reference implementation of GraphQL, but the execution model is the same in all GraphQL servers I know of.

At the end of this section, you’ll understand how a GraphQL server uses the schema and resolve functions together to execute the query and produce the desired result.

Here’s a query that works with the schema introduced earlier. It fetches an author’s name, all the posts for that author, and the name of the author of each post.

{
  getAuthor(id: 5){
    name
    posts {
      title
      author {
        name # this will be the same as the name above
      }
    }
  }
}

Side note: If you look closely, you will notice that this query fetches the name of the same author twice. I’m just doing that here to illustrate GraphQL while keeping the schema as simple as possible.

Here are the three high-level steps the server takes to respond to the query:

  1. Parse
  2. Validate
  3. Execute

Step 1: Parsing the query

First, the server parses the string and turns it into an AST — an abstract syntax tree. If there are any syntax errors, the server will stop execution and return the syntax error to the client.
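As an illustration of what parsing means here, the sketch below turns a small subset of the query language (fields, literal arguments, nested selection sets) into a tree. It is a toy written for this explanation, not the GraphQL-JS parser: the node shape is an assumption, and it ignores variables, fragments, directives and aliases, and silently drops characters it does not recognize.

```javascript
// Split a query string into punctuation, names, integers, and quoted strings.
function tokenize(source) {
  const withoutComments = source.replace(/#[^\n]*/g, "");
  return withoutComments.match(/[{}():]|[A-Za-z_]\w*|-?\d+|"[^"]*"/g) || [];
}

// Parse one field: name, optional "(arg: value ...)", optional selection set.
function parseField(tokens, pos) {
  const name = tokens[pos.i++];
  if (typeof name !== "string" || !/^[A-Za-z_]\w*$/.test(name)) {
    throw new SyntaxError(`Expected a field name but found ${name}`);
  }
  const field = { kind: "Field", name, args: {}, selections: [] };
  if (tokens[pos.i] === "(") {
    pos.i++;
    while (tokens[pos.i] !== ")") {
      const argName = tokens[pos.i++];
      if (tokens[pos.i++] !== ":") {
        throw new SyntaxError(`Expected ":" after argument ${argName}`);
      }
      // Integer and string literals happen to be valid JSON values.
      field.args[argName] = JSON.parse(tokens[pos.i++]);
    }
    pos.i++; // consume ")"
  }
  if (tokens[pos.i] === "{") field.selections = parseSelectionSet(tokens, pos);
  return field;
}

// Parse a selection set: "{" followed by fields, closed by "}".
function parseSelectionSet(tokens, pos) {
  if (tokens[pos.i] !== "{") {
    throw new SyntaxError(`Expected "{" but found ${tokens[pos.i]}`);
  }
  pos.i++;
  const fields = [];
  while (tokens[pos.i] !== "}") {
    if (pos.i >= tokens.length) {
      throw new SyntaxError('Expected "}" before the end of the query');
    }
    fields.push(parseField(tokens, pos));
  }
  pos.i++; // consume "}"
  return fields;
}

function parse(source) {
  const tokens = tokenize(source);
  const pos = { i: 0 };
  const selections = parseSelectionSet(tokens, pos);
  if (pos.i !== tokens.length) {
    throw new SyntaxError(`Unexpected token ${tokens[pos.i]}`);
  }
  return { kind: "Document", selections };
}
```

Calling parse on a query with a missing closing brace throws a SyntaxError, which mirrors the server returning the syntax error to the client instead of executing anything.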

Step 2: Validation

A query can be syntactically correct, but still make no sense, just like the following English sentence is syntactically correct, but doesn’t make any sense: “The sand left through the idea”.

The validation stage makes sure that the query is valid given the schema before execution starts. It checks things like:

  • Is getAuthor a field of the Query type?
  • Does getAuthor accept an argument named id?
  • Are name and posts fields on the type returned by getAuthor?
  • … and many more …

As an application developer, you don’t need to worry about this part, because the GraphQL server does it automatically. Put that in contrast to most RESTful APIs, where it’s up to you — the developer — to make sure that all the parameters are valid.
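The checks listed above can be sketched as a walk over the parsed query with the schema at hand. This is a simplified illustration, not how GraphQL-JS implements validation: the schema is written as plain data, list types such as [Post] are flattened to their item type, argument types are not checked, and the error messages are only modeled on the usual wording.

```javascript
// The schema from earlier as plain data: type -> field -> { type, args }.
const schema = {
  Query: {
    getAuthor: { type: "Author", args: ["id"] },
    getPostsByTitle: { type: "Post", args: ["titleContains"] },
  },
  Author: {
    id: { type: "Int", args: [] },
    name: { type: "String", args: [] },
    posts: { type: "Post", args: [] },
  },
  Post: {
    id: { type: "Int", args: [] },
    title: { type: "String", args: [] },
    text: { type: "String", args: [] },
    author: { type: "Author", args: [] },
  },
};

// Collect validation errors for the fields selected on a given type.
function validate(selections, typeName, errors = []) {
  for (const field of selections) {
    const definition = schema[typeName][field.name];
    if (!definition) {
      errors.push(`Cannot query field "${field.name}" on type "${typeName}".`);
      continue;
    }
    for (const argName of Object.keys(field.args || {})) {
      if (!definition.args.includes(argName)) {
        errors.push(`Unknown argument "${argName}" on field "${typeName}.${field.name}".`);
      }
    }
    const isObjectType = definition.type in schema;
    const children = field.selections || [];
    if (isObjectType && children.length === 0) {
      errors.push(`Field "${field.name}" must have a selection of subfields.`);
    } else if (!isObjectType && children.length > 0) {
      errors.push(`Field "${field.name}" must not have a selection of subfields.`);
    } else if (isObjectType) {
      validate(children, definition.type, errors);
    }
  }
  return errors;
}

// A hand-written tree for: { getAuthor(id: 5) { name posts { title } } }
const validQuery = [
  {
    name: "getAuthor",
    args: { id: 5 },
    selections: [{ name: "name" }, { name: "posts", selections: [{ name: "title" }] }],
  },
];
const validationErrors = validate(validQuery, "Query");
```

An empty error list means execution can start; any entry in the list is returned to the client and nothing is executed.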

Step 3: Execution

If validation is passed, the GraphQL server will execute the query.

Every GraphQL query has the shape of a tree — i.e. it is never circular. Execution begins at the root of the query. First, the executor calls the resolve function of the fields at the top level — in this case just getAuthor — with the provided parameters. It waits until all these resolve functions have returned a value, and then proceeds in a cascading fashion down the tree. If a resolve function returns a promise, the executor will wait until that promise is resolved.

That was the one-paragraph description of the execution flow. I think it’s always easier to understand things when they’re illustrated in different ways, so I made a diagram, a table and even a video that walks you through it step by step.

The execution flow in diagram form:

Execution starts at the top. Resolve functions at the same level are executed concurrently.

The execution flow in table form:

3.1: run Query.getAuthor
3.2: run Author.name and Author.posts (for Author returned in 3.1)
3.3: run Post.title and Post.author (for each Post returned in 3.2)
3.4: run Author.name (for each Author returned in 3.3)
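The same flow can be sketched as a small recursive executor. This is a teaching sketch under several assumptions, not the GraphQL-JS executor: the query is a hand-written tree, the data lives in hypothetical in-memory arrays, fieldTypes is a hypothetical lookup for the type each object field returns, and fields without a resolve function fall back to reading the property of the same name. Unlike the strict level-by-level description, it descends into each field as soon as that field’s own value is ready.

```javascript
// Hypothetical in-memory data standing in for a database.
const authors = [{ id: 5, name: "John Doe" }];
const posts = [
  { id: 1, title: "Hello World", author_id: 5 },
  { id: 2, title: "Second post", author_id: 5 },
];

// Which type each object-valued field returns (list types flattened).
const fieldTypes = {
  Query: { getAuthor: "Author" },
  Author: { posts: "Post" },
  Post: { author: "Author" },
};

// Resolve functions that return promises, as a database client would.
const resolvers = {
  Query: { getAuthor: (_, { id }) => Promise.resolve(authors.find((a) => a.id === id)) },
  Author: { posts: (author) => Promise.resolve(posts.filter((p) => p.author_id === author.id)) },
  Post: { author: (post) => Promise.resolve(authors.find((a) => a.id === post.author_id)) },
};

// Turn a resolved value into response data: recurse into objects and lists.
async function complete(typeName, value, selections) {
  if (value == null || selections.length === 0) return value ?? null;
  if (Array.isArray(value)) {
    return Promise.all(value.map((item) => complete(typeName, item, selections)));
  }
  return executeSelections(typeName, value, selections);
}

// Resolve all selected fields of one object concurrently.
async function executeSelections(typeName, parent, selections) {
  const entries = await Promise.all(
    selections.map(async (field) => {
      const resolve = (resolvers[typeName] || {})[field.name];
      // Default resolver: read the property with the field's name.
      const value = await (resolve ? resolve(parent, field.args || {}) : parent[field.name]);
      const childType = (fieldTypes[typeName] || {})[field.name];
      return [field.name, await complete(childType, value, field.selections || [])];
    })
  );
  return Object.fromEntries(entries);
}

// The example query as a tree: getAuthor(id: 5) { name posts { title author { name } } }
const queryTree = [
  {
    name: "getAuthor",
    args: { id: 5 },
    selections: [
      { name: "name" },
      { name: "posts", selections: [{ name: "title" }, { name: "author", selections: [{ name: "name" }] }] },
    ],
  },
];
const resultPromise = executeSelections("Query", {}, queryTree);
```

The promise resolves to an object with the same shape as the query tree, which is why the response of a GraphQL server mirrors the query.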

The execution flow in text form (with all the details):

Feel free to skip this section if you already understand how it works from the diagram or the table.

Just for convenience, here’s the query again:

{
  getAuthor(id: 5){
    name
    posts {
      title
      author {
        name # this will be the same as the name above
      }
    }
  }
}

In this query, there is only one root field — getAuthor — and there is one parameter — id — with value 5. The getAuthor resolve function will run and return a promise.

getAuthor(_, { id }){
  return DB.Authors.findOne(id);
}

// let's assume this returns a promise that then resolves to the
// following object from the database:
{ id: 5, name: "John Doe" }

The promise is resolved when the database call returns. As soon as that happens, the GraphQL server will take the return value of this resolve function — an object in this case — and pass it to the resolve functions of the name and posts fields on Author, because those are the fields that were requested in the query. The name and posts resolve functions run in parallel:

name(author){
  return author.name;
}

posts(author){
  return DB.Posts.getByAuthorId(author.id);
}

The name resolve function is pretty straightforward: it simply returns the name property of the author object that was just passed down from the getAuthor resolve function.

The posts resolve function makes a call to the database and returns a list of post objects:

// list returned by DB.Posts.getByAuthorId(5)
[{
  id: 1,
  title: "Hello World",
  text: "I am here",
  author_id: 5
},{
  id: 2,
  title: "Why am I still up at midnight writing this post?",
  text: "GraphQL's query language is incredibly easy to ...",
  author_id: 5
}]

Note: GraphQL-JS waits for all promises in a list to be resolved/rejected before it calls the next level of resolve functions.

Because the query asks for the title and author fields of each Post, GraphQL then runs four resolve functions in parallel: the title and author for each post.

The title resolve function is trivial again, and the author resolve function is the same as the one for getAuthor, except that it uses the author_id field on post, whereas the getAuthor function used the id argument:

author(post){
  return DB.Authors.findOne(post.author_id);
}

Finally, the GraphQL executor calls the name resolve function of Author again, this time with the author objects returned by the author resolve function of Post. It runs twice, once for each Post.

And we’re done! All that’s left to do is pass the results up to the root of the query and return the result:

{
  data: {
    getAuthor: {
      name: "John Doe",
      posts: [
        {
          title: "Hello World",
          author: {
            name: "John Doe"
          }
        },{
          title: "Why am I still up at midnight writing this post?",
          author: {
            name: "John Doe"
          }
        }
      ]
    }
  }
}

Note: This example was slightly simplified. A real production GraphQL server would use batching and caching to reduce the number of requests to the backend and avoid making redundant ones — like fetching the same author twice. But that’s a topic for another post!
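To give a rough idea of what batching and caching can look like, here is a minimal loader sketch. It is an assumption-laden illustration rather than any particular library’s API: createLoader and batchFetch are hypothetical names, ids requested in the same synchronous run are grouped into one backend call, and the promise for each id is cached for the lifetime of the loader.

```javascript
// A tiny loader: collects the ids requested before the next microtask,
// fetches them in one batch, and caches one promise per id.
function createLoader(batchFetch) {
  const cache = new Map();
  let pending = [];

  function flush() {
    const batch = pending;
    pending = [];
    batchFetch(batch.map((entry) => entry.id)).then(
      (rows) => batch.forEach((entry, index) => entry.resolve(rows[index])),
      (error) => batch.forEach((entry) => entry.reject(error))
    );
  }

  return function load(id) {
    if (!cache.has(id)) {
      cache.set(
        id,
        new Promise((resolve, reject) => {
          // Schedule a single flush for the first id of each batch.
          if (pending.length === 0) queueMicrotask(flush);
          pending.push({ id, resolve, reject });
        })
      );
    }
    return cache.get(id);
  };
}

// Hypothetical backend: returns one row (or null) per requested id, in order.
const authorsById = { 5: { id: 5, name: "John Doe" } };
let batchCalls = 0;
const loadAuthor = createLoader((ids) => {
  batchCalls++;
  return Promise.resolve(ids.map((id) => authorsById[id] || null));
});

// Two posts by the same author now share one cached promise.
const firstLoad = loadAuthor(5);
const secondLoad = loadAuthor(5);
```

In a server you would typically create one loader per incoming request, so that cached values never leak between users.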

Conclusion

As you can see, once you dive into it, GraphQL is pretty easy to understand! I think that’s pretty remarkable when you take into account how easy it makes things like joins, filtering, argument validation, and documentation, which are all hard problems to solve in traditional RESTful APIs.

Of course, there’s a lot more to GraphQL than what I wrote here, but that’s a topic for future posts!

If this got you interested in trying GraphQL for yourself, you should check out our GraphQL server tutorial, or read about using GraphQL on the client together with React + Redux.


2018 Update: Understanding GraphQL execution with Apollo Engine

In the time since Jonas wrote this post, we’ve also built a service, called Apollo Engine, that helps developers understand and monitor what happens in their GraphQL server.

If you’re interested in seeing the execution of your GraphQL queries in real life, you can sign in and instrument your server here. If you’re interested in support running a highly performant, modern application with GraphQL, we can help! Let us know.

Apollo Engine’s operation overview: a heatmap of query service time and request rate/error rate chart.

The Anatomy of a GraphQL Query

GraphQL is just entering the mainstream as a new standard for data fetching. There are now a lot of great conversations happening around developments in the technology and new tools being built every day. One of the best parts of GraphQL is that it gives you a great common language with your team to talk about the data available in your API. But how should you talk about the query language and the core technology itself?

Well, it turns out names for almost every concept in the GraphQL language are right there in the GraphQL specification. But the spec is pretty long, so in this post I’ll lay out some of the most important concepts and terms, with concrete examples, so that you can be an expert in talking about GraphQL.

Note: If you’re trying to learn GraphQL, this isn’t the best place to start. First, read through the concepts on the graphql.org docs, then try using GraphQL with the excellent Learn Apollo tutorial, and finally come back here when you want to go deep into technical language.


Basic GraphQL queries

People commonly call everything that hits your GraphQL API server a “query”. But there are a lot of things mixed in there. What do we call a unit of work we’re asking the server to do? It could be a query, a mutation, or a subscription. The word “request” is pretty coupled to the idea of HTTP and the transport. So let’s start by defining some general concepts:

  • GraphQL document: A string written in the GraphQL language that defines one or more operations and fragments.
  • Operation: A single query, mutation, or subscription that can be interpreted by a GraphQL execution engine.

What are the different parts of a basic operation? Let’s look at a very simple example of a GraphQL document.

A simple query and its parts.

This document shows off the main building blocks of GraphQL, which specify the data you’re trying to fetch.

  • Field: A unit of data you are asking for, which ends up as a field in your JSON response data. Note that they are always called “fields”, regardless of how deep in the query they appear. A field on the root of your operation works the same way as one nested deeper in the query.
  • Arguments: A set of key-value pairs attached to a specific field. These are passed into the server-side execution of this field, and affect how it’s resolved. The arguments can be literal values, as in the query above, or variables, as in the examples below. Note that arguments can appear on any field, even fields nested deep in an operation.

The query above is somewhat of a shorthand form of GraphQL, which lets you define a query operation in a very concise way. But there are three optional parts to a GraphQL operation that aren’t used there. You’ll need these new parts if you want to execute something other than a query, or pass dynamic variables.

Here’s an example that includes all of them:

A more detailed query and its parts.

  • Operation type: This is either query, mutation, or subscription. It describes what type of operation you’re trying to do. While all of them look similar in the language, they have slightly different modes of execution on a spec-compliant GraphQL server.
  • Operation name: For debugging and server-side logging reasons, it’s useful to give your queries meaningful names. That way, when you see something wrong going on either in your network logs or your GraphQL server (for example in a tool like Apollo Optics), you can easily find that query in your codebase by name instead of trying to decipher the contents. Think of it like a function name in your favorite programming language.
  • Variable definitions: When you send a query to your GraphQL server, you might have some dynamic parts that change between requests, while the actual query document stays the same. These are the variables of your query. Because GraphQL is statically typed, it can actually validate for you that you are passing in the right variables. This is where you declare the types of variables you are planning to provide.

Variables are passed separately from the query document in a transport-specific way. In today’s GraphQL server implementations, that’s usually JSON. Here’s how a variables object might look for the query above:

An example variables object.

You can see that the key here matches the name of the variable defined in the variable definitions, and the value is one of the members of the Episode enum.

  • Variables: The dictionary of values passed along with a GraphQL operation, that provides dynamic parameters to that operation.
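Since variables travel next to the document rather than inside it, a request body over HTTP commonly looks like the sketch below. The operation itself is an assumption modeled on the widely used Star Wars example schema (the query in the original figure may differ), and buildRequestBody is a hypothetical helper; the query, operationName and variables keys follow the usual convention for GraphQL over HTTP.

```javascript
// Hypothetical operation with one variable definition ($episode: Episode).
const query = `
  query HeroNameAndFriends($episode: Episode) {
    hero(episode: $episode) {
      name
    }
  }
`;

// The document stays constant; only the variables object changes per request.
function buildRequestBody(document, variables, operationName) {
  return JSON.stringify({ query: document, operationName, variables });
}

const body = buildRequestBody(query, { episode: "JEDI" }, "HeroNameAndFriends");
```

Keeping the document constant and varying only the variables also makes requests easier to log, cache and compare by operation name.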

There’s one other core concept that’s not often named, but is important when talking about GraphQL in a technical sense: what’s all that stuff between the curly braces called?

The selection set is a concept you’ll see very frequently in the GraphQL specification, and it is what gives GraphQL its recursive nature, allowing you to do nested data fetching.

  • Selection set: A set of fields requested in an operation, or nested within another field. A GraphQL query must contain a selection set on any field that returns an object type, and selection sets are not allowed on fields that return scalar types, such as Int or String.

Fragments

GraphQL becomes even more powerful when you introduce fragments. These bring with them a new set of concepts.

  • Fragment definition: Part of a GraphQL document which defines a GraphQL fragment. This is also sometimes called a named fragment, in contrast to an inline fragment which we’ll get to below.

A fragment definition and its parts.

  • Fragment name: The name of each fragment has to be unique within a GraphQL document. This is the name you use to refer to the fragment in an operation or in other fragments. Fragment names can also be useful for server-side logging, much like operation names, so we recommend using explicit and meaningful names. If you name your fragments well, you can track down which part of your code defines that fragment if you want to optimize your data fetching later.
  • Type condition: GraphQL operations always start at the query, mutation, or subscription type in your schema, but fragments can be used in any selection set. So in order to validate a fragment against your schema in isolation, you need to specify which type it can be used on, and that’s where the type condition comes in.

And, just like operations, fragments have a selection set. These work just like the selection sets in operations.

Using fragments in your operations

Fragments aren’t very useful until you use them in an operation. Fragments can appear in two different ways, as we’ll see below:

Using two different kinds of fragments in a query.

  • Fragment spread: When you use a fragment inside an operation or another fragment, you do so by putting ... followed by the fragment name. This is called a fragment spread, and it can appear in any selection set that matches that named fragment’s type condition.
  • Inline fragment: When you just want to execute some fields depending on the type of a result, but you don’t want to split that out into a separate definition, you can use an inline fragment. This works just like a named fragment, but is written as part of your query. One difference about inline fragments is that they aren’t actually required to have a type condition, and can be used with just a directive, as we’ll see in the example below.

Directives

Directives are a way to get additional functionality out of your GraphQL server. Directives shouldn’t affect the value of the results, but do affect which results come back and perhaps how they are executed. They can appear almost anywhere in a query, but in this article we’ll focus only on the skip and include directives that are in the current GraphQL specification.

You probably wouldn’t usually put all of these in one query, but it’s the easiest way to demonstrate.

Above is a kitchen sink of examples of where you can use the skip and include directives. These directives in particular make GraphQL execution conditionally skip fields and omit them from the response. Because the syntax of directives is quite flexible, they can be used to add more features to GraphQL without making parsing or tooling more complicated.

  • Directive: An annotation on a field, fragment, or operation that affects how it is executed or returned.
  • Directive arguments: These work just like field arguments, but they are handled by the execution engine instead of being passed down to the field resolver.
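The behaviour of the two built-in directives can be summarized in a few lines. The sketch below is a simplified model of what an execution engine does, with an assumed data shape: each directive is an object with a name and an args map, and an argument is either a literal boolean or a reference of the form { variable: "name" } that is looked up in the operation’s variables.

```javascript
// Decide whether a field stays in the response, given the directives
// attached to it and the variables sent with the operation.
function shouldInclude(directives, variables) {
  const valueOf = (arg) =>
    typeof arg === "object" && arg !== null ? variables[arg.variable] : arg;
  for (const directive of directives) {
    // @skip(if: true) removes the field.
    if (directive.name === "skip" && valueOf(directive.args.if) === true) return false;
    // @include(if: false) removes the field as well.
    if (directive.name === "include" && valueOf(directive.args.if) === false) return false;
  }
  return true;
}

// Example: friends @include(if: $withFriends) with { withFriends: false }
const keepFriends = shouldInclude(
  [{ name: "include", args: { if: { variable: "withFriends" } } }],
  { withFriends: false }
);
```

Note that the field resolver never sees the if argument: the decision is made before the field would be resolved, which is what “handled by the execution engine” means in the definition above.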

Get involved in the conversation

A big part of the benefit of GraphQL is having a common language to talk about data fetching. Now, you’ll be well-equipped to participate in deep, technical conversations about GraphQL, for example the ongoing discussion about GraphQL subscriptions.

In this article, we’ve only covered part of the GraphQL specification — the part that deals with the query language. In a future post, we might delve into the various terms used to describe a GraphQL schema.

Want to work on GraphQL technology full-time? We’re hiring for a variety of positions including frontend, backend, and open source!

GraphQL vs. REST

Often, GraphQL is presented as a revolutionary new way to think about APIs. Instead of working with rigid server-defined endpoints, you can send queries to get exactly the data you’re looking for in one request. And it’s true — GraphQL can be transformative when adopted in an organization, enabling frontend and backend teams to collaborate more smoothly than ever before. But in practice, both of these technologies involve sending an HTTP request and receiving some result, and GraphQL has many elements of the REST model built in.

So what’s the real deal on a technical level? What are the similarities and differences between these two API paradigms? My claim by the end of the article is going to be that GraphQL and REST are not so different after all, but that GraphQL has some small changes that make a big difference to the developer experience of building and consuming an API.

So let’s jump right in. We’ll identify some properties of an API, and then discuss how GraphQL and REST handle them.

Resources

The core idea of REST is the resource. Each resource is identified by a URL, and you retrieve that resource by sending a GET request to that URL. You will likely get a JSON response, since that’s what most APIs are using these days. So it looks something like:

GET /books/1

{
  "title": "Black Hole Blues",
  "author": { 
    "firstName": "Janna",
    "lastName": "Levin"
  }
  // ... more fields here
}

Note: In the example above, some REST APIs would return “author” as a separate resource.

One thing to note in REST is that the type, or shape, of the resource and the way you fetch that resource are coupled. When you talk about the above in REST documentation, you might refer to it as the “book endpoint”.

GraphQL is quite different in this respect, because in GraphQL these two concepts are completely separate. In your schema, you might have Book and Author types:

type Book {
  id: ID
  title: String
  published: Date
  price: String
  author: Author
}

type Author {
  id: ID
  firstName: String
  lastName: String
  books: [Book]
}

Notice that we have described the kinds of data available, but this description doesn’t tell you anything at all about how those objects might be fetched from a client. That’s one core difference between REST and GraphQL — the description of a particular resource is not coupled to the way you retrieve it.

To be able to actually access a particular book or author, we need to create a Query type in our schema:

type Query {
  book(id: ID!): Book
  author(id: ID!): Author
}

Now, we can send a request similar to the REST request above, but with GraphQL this time:

GET /graphql?query={ book(id: "1") { title, author { firstName } } }

{
  "title": "Black Hole Blues",
  "author": {
    "firstName": "Janna"
  }
}

Nice, now we’re getting somewhere! We can immediately see a few things about GraphQL that are quite different from REST, even though both can be requested via URL, and both can return the same shape of JSON response.

First of all, we can see that the URL with a GraphQL query specifies the resource we’re asking for and also which fields we care about. Also, rather than the server author deciding for us that the related author resource needs to be included, the consumer of the API decides.

But most importantly, the identities of the resources, the concepts of Books and Authors, are not coupled to the way they are fetched. We could potentially retrieve the same Book through many different types of queries, and with different sets of fields.
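To make that last point concrete, here is a minimal sketch of how a client might build GET URLs for two different queries against the same Book. The buildGraphQLUrl helper is hypothetical (it is not part of any GraphQL library), and the /graphql path simply mirrors the example above.

```javascript
// Hypothetical helper: builds a GET URL that carries a GraphQL query string.
// The "/graphql" path mirrors the example above; the helper name is an assumption.
function buildGraphQLUrl(endpoint, query) {
  // Collapse whitespace so the URL stays compact, then percent-encode it.
  const compact = query.replace(/\s+/g, ' ').trim();
  return `${endpoint}?query=${encodeURIComponent(compact)}`;
}

// Two different queries for the same Book, each asking for a different set of fields.
const titleOnly = buildGraphQLUrl('/graphql', '{ book(id: "1") { title } }');
const withAuthor = buildGraphQLUrl(
  '/graphql',
  `{
    book(id: "1") {
      title
      author { firstName lastName }
    }
  }`
);

console.log(titleOnly);
console.log(withAuthor);
```

Both URLs point at the same endpoint and the same book ID. Only the query text changes, which is what decides the shape of the response.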

Conclusion

We’ve identified some similarities and differences already:

  • Similar: Both have the idea of a resource, and can specify IDs for those resources.
  • Similar: Both can be fetched via an HTTP GET request with a URL.
  • Similar: Both can return JSON data in the response.
  • Different: In REST, the endpoint you call is the identity of that object. In GraphQL, the identity is separate from how you fetch it.
  • Different: In REST, the shape and size of the resource is determined by the server. In GraphQL, the server declares what resources are available, and the client asks for what it needs at the time.

Alright, this was pretty basic if you’ve already used GraphQL and/or REST. If you haven’t used GraphQL before, you can play around with an example similar to the above on Launchpad, a tool for building and exploring GraphQL examples in your browser.

URL Routes vs GraphQL Schema

An API isn’t useful if it isn’t predictable. When you consume an API, you’re usually doing it as part of some program, and that program needs to know what it can call and what it should expect to receive as the result, so that it can operate on that result.

So one of the most important parts of an API is a description of what can be accessed. This is what you’re learning when you read API documentation, and with GraphQL introspection and REST API schema systems like Swagger, this information can be examined programmatically.

In today’s REST APIs, the API is usually described as a list of endpoints:

GET /books/:id
GET /authors/:id
GET /books/:id/comments
POST /books/:id/comments

So you could say that the “shape” of the API is linear — there’s a list of things you can access. When you are retrieving data or saving something, the first question to ask is “which endpoint should I call”?

In GraphQL, as we covered above, you don’t use URLs to identify what is available in the API. Instead, you use a GraphQL schema:

type Query {
  book(id: ID!): Book
  author(id: ID!): Author
}

type Mutation {
  addComment(input: AddCommentInput): Comment
}

type Book { ... }
type Author { ... }
type Comment { ... }
input AddCommentInput { ... }

There are a few interesting bits here when compared to the REST routes for a similar data set. First, instead of sending a different HTTP verb to the same URL to differentiate a read vs. a write, GraphQL uses a different initial type — Mutation vs. Query. In a GraphQL document, you can select which type of operation you’re sending with a keyword:

query { ... }
mutation { ... }

For all of the details about the query language, read my earlier post, “The Anatomy of a GraphQL Query”.

You can see that the fields on the Query type match up pretty nicely with the REST routes we had above. That’s because this special type is the entry point into our data, so this is the most equivalent concept in GraphQL to an endpoint URL.

The way you get the initial resource from a GraphQL API is quite similar to REST — you pass a name and some parameters — but the main difference is where you can go from there. In GraphQL, you can send a complex query that fetches additional data according to relationships defined in the schema, but in REST you would have to do that via multiple requests, build the related data into the initial response, or include some special parameters in the URL to modify the response.
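Here is a rough simulation of that difference, using in-memory objects instead of a real network. Everything in it (the data, restGet, graphqlBookWithAuthor, and the round-trip counter) is made up for illustration. It only counts how many requests each style needs to get a book's title plus its author's first name.

```javascript
// In-memory stand-ins for a real backend; all data and names here are illustrative.
const books = { 1: { id: '1', title: 'Black Hole Blues', authorId: '7' } };
const authors = { 7: { id: '7', firstName: 'Janna', lastName: 'Levin' } };

let roundTrips = 0;

// Simulated REST server: one path, one resource per request.
function restGet(path) {
  roundTrips += 1;
  const [, collection, id] = path.split('/');
  return collection === 'books' ? books[id] : authors[id];
}

// Simulated GraphQL server: one request, with related data followed on the server.
function graphqlBookWithAuthor(id) {
  roundTrips += 1;
  const book = books[id];
  const author = authors[book.authorId];
  return { title: book.title, author: { firstName: author.firstName } };
}

// REST: fetch the book, then follow up with a second request for its author.
const restBook = restGet('/books/1');
const restAuthor = restGet(`/authors/${restBook.authorId}`);
const restTrips = roundTrips;

// GraphQL: the same data arrives in a single request.
roundTrips = 0;
const gqlResult = graphqlBookWithAuthor('1');
const graphqlTrips = roundTrips;

console.log({ restTrips, graphqlTrips });
```

A REST API could avoid the second request by embedding the author in the book response or by accepting a special URL parameter, but that decision is then made by the server rather than by the client.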

Conclusion

In REST, the space of accessible data is described as a linear list of endpoints, and in GraphQL it’s a schema with relationships.

  • Similar: The list of endpoints in a REST API is similar to the list of fields on the Query and Mutation types in a GraphQL API. They are both the entry points into the data.
  • Similar: Both have a way to differentiate if an API request is meant to read data or write it.
  • Different: In GraphQL, you can traverse from the entry point to related data, following relationships defined in the schema, in a single request. In REST, you have to call multiple endpoints to fetch related resources.
  • Different: In GraphQL, there’s no difference between the fields on the Query type and the fields on any other type, except that only the Query type is accessible at the root of a query. For example, you can have arguments in any field in a query. In REST, there’s no first-class concept of a nested URL.
  • Different: In REST, you specify a write by changing the HTTP verb from GET to something else like POST. In GraphQL, you change a keyword in the query.

Because of the first point in the list of similarities above, people often start referring to fields on the Query type as GraphQL “endpoints” or “queries”. While that’s a reasonable comparison, it can lead to a misleading perception that the Query type works significantly differently from other types, which is not the case.

Route Handlers vs. Resolvers

So what happens when you actually call an API? Well, usually it executes some code on the server that received the request. That code might do a computation, load data from a database, call a different API, or really do anything. The whole idea is you don’t need to know from the outside what it’s doing. But both REST and GraphQL have pretty standard ways for implementing the inside of that API, and it’s useful to compare them to get a sense for how these technologies are different.

In this comparison I’ll use JavaScript code because that’s what I’m most familiar with, but of course you can implement both REST and GraphQL APIs in almost any programming language. I’ll also skip any boilerplate required for setting up the server, since it’s not important to the concepts.

Let’s look at a hello world example with express, a popular API library for Node:

app.get('/hello', function (req, res) {
  res.send('Hello World!')
})

Here you see we’ve created a /hello endpoint that returns the string 'Hello World!'. From this example we can see the lifecycle of an HTTP request in a REST API server:

  1. The server receives the request and retrieves the HTTP verb (GET in this case) and URL path
  2. The API library matches up the verb and path to a function registered by the server code
  3. The function executes once, and returns a result
  4. The API library serializes the result, adds an appropriate response code and headers, and sends it back to the client
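The four steps above can be sketched without any framework. This is a toy router, not how Express is implemented. The names routes, register, and handle are assumptions made for this example.

```javascript
// A toy router, not Express itself: names like `routes` and `handle` are assumptions.
const routes = new Map();

// Register a handler for a verb + path pair (conceptually what app.get does).
function register(verb, path, handler) {
  routes.set(`${verb} ${path}`, handler);
}

// Steps 1-4 of the lifecycle for a single incoming request.
function handle(request) {
  const { verb, path } = request;                 // 1. read the verb and path
  const handler = routes.get(`${verb} ${path}`);  // 2. match them to a registered function
  if (!handler) {
    return { status: 404, headers: {}, body: 'Not Found' };
  }
  const result = handler(request);                // 3. run the function once
  return {                                        // 4. serialize, add status and headers
    status: 200,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(result),
  };
}

register('GET', '/hello', () => 'Hello World!');

const helloResponse = handle({ verb: 'GET', path: '/hello' });
const missingResponse = handle({ verb: 'GET', path: '/nope' });
console.log(helloResponse.status, helloResponse.body);
console.log(missingResponse.status);
```

The key property is that one request maps to exactly one handler call, which is the point of comparison with GraphQL below.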

GraphQL works in a very similar way, and for the same hello world example it’s virtually identical:

const resolvers = {
  Query: {
    hello: () => {
      return 'Hello world!';
    },
  },
};

As you can see, instead of providing a function for a specific URL, we’re providing a function that matches a particular field on a type, in this case the hello field on the Query type. In GraphQL, this function that implements a field is called a resolver.

To make a request we need a query:

query {
  hello
}

So here’s what happens when our server receives a GraphQL request:

  1. The server receives the request, and retrieves the GraphQL query
  2. The query is traversed, and for each field the appropriate resolver is called. In this case, there’s just one field, hello, and it’s on the Query type
  3. The function is called, and it returns a result
  4. The GraphQL library and server attach that result to a response that matches the shape of the query

So you would get back:

{ "hello": "Hello world!" }

But here’s one trick: we can actually call the field twice!

query {
  hello
  secondHello: hello
}

In this case, the same lifecycle happens as above, but since we’ve requested the same field twice using an alias, the hello resolver is actually called twice. This is clearly a contrived example, but the point is that multiple fields can be executed in one request, and the same field can be called multiple times at different points in the query.
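To see the resolver really being called twice, here is a small sketch with a call counter. The selections array is a hand-written stand-in for what a GraphQL parser would produce from the query above, and executeRootFields is a hypothetical helper, not a function from any GraphQL library.

```javascript
let helloCalls = 0;

const resolvers = {
  Query: {
    hello: () => {
      helloCalls += 1; // count invocations so we can observe the alias behavior
      return 'Hello world!';
    },
  },
};

// Hand-written stand-in for a parsed query: each entry is a field plus an optional alias.
// A real server would get this from a GraphQL parser.
const selections = [
  { name: 'hello' },
  { name: 'hello', alias: 'secondHello' },
];

// Call the resolver for every selected field and key the result by alias (or field name).
function executeRootFields(typeResolvers, fields) {
  const response = {};
  for (const field of fields) {
    response[field.alias || field.name] = typeResolvers[field.name]();
  }
  return response;
}

const aliasResult = executeRootFields(resolvers.Query, selections);
console.log(aliasResult, helloCalls);
```

The response has two keys, hello and secondHello, and the counter ends at 2: one resolver call per selected field, not per distinct field name.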

This wouldn’t be complete without an example of “nested” resolvers:

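// find and filter are assumed to be the lodash helpers;
// authors and posts are assumed to be in-memory arrays defined elsewhere
const { find, filter } = require('lodash');

const resolvers = {
  Query: {
    // Look up a single author by the id argument
    author: (root, { id }) => find(authors, { id: id }),
  },
  Author: {
    // Resolve an author's posts from the parent author object
    posts: (author) => filter(posts, { authorId: author.id }),
  },
};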

These resolvers would be able to fulfill a query like:

query {
  author(id: 1) {
    firstName
    posts {
      title
    }
  }
}

So even though the set of resolvers is actually flat, because they are attached to various types you can build them up into nested queries. Read more about how GraphQL execution works in the post “GraphQL Explained”.
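As a rough illustration of how a flat resolver map turns into a nested response, here is a heavily simplified executor. It is a sketch under several assumptions: the query is written by hand as a plain object instead of being parsed, the sample data is invented, and the execute function is hypothetical and skips everything a real GraphQL library does around validation, errors, and promises.

```javascript
// Sample data; the records are made up for illustration.
const authors = [{ id: 1, firstName: 'Janna' }];
const posts = [
  { id: 10, authorId: 1, title: 'First post' },
  { id: 11, authorId: 1, title: 'Second post' },
  { id: 12, authorId: 2, title: 'Someone else' },
];

// Flat resolver map, keyed by type name and then field name.
const resolvers = {
  Query: {
    author: (root, { id }) => authors.find((a) => a.id === id),
  },
  Author: {
    posts: (author) => posts.filter((p) => p.authorId === author.id),
  },
};

// Hand-written stand-in for the parsed query; `type` names the type each field returns.
const query = [
  {
    name: 'author', args: { id: 1 }, type: 'Author',
    fields: [
      { name: 'firstName' },
      { name: 'posts', type: 'Post', fields: [{ name: 'title' }] },
    ],
  },
];

// Resolve each field with its resolver if one exists, else read the property from the parent.
function execute(typeName, parent, selections) {
  const out = {};
  for (const sel of selections) {
    const resolver = (resolvers[typeName] || {})[sel.name];
    const value = resolver ? resolver(parent, sel.args || {}) : parent[sel.name];
    if (!sel.fields || value == null) {
      out[sel.name] = value;                       // leaf field (or nothing to descend into)
    } else if (Array.isArray(value)) {
      out[sel.name] = value.map((item) => execute(sel.type, item, sel.fields));
    } else {
      out[sel.name] = execute(sel.type, value, sel.fields);
    }
  }
  return out;
}

const nestedResult = execute('Query', {}, query);
console.log(JSON.stringify(nestedResult, null, 2));
```

Fields without an explicit resolver, such as firstName and title, fall back to reading the property of the same name from the parent object. That fallback is why the resolver map can stay so small.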

See a complete example and run different queries to test this out!

An artist’s interpretation of fetching resources with multiple REST roundtrips vs. one GraphQL request

Conclusion

At the end of the day, both REST and GraphQL APIs are just fancy ways to call functions over a network. If you’re familiar with building a REST API, implementing a GraphQL API won’t feel too different. But GraphQL has a big leg up because it lets you call several related functions without multiple roundtrips.

  • Similar: Endpoints in REST and fields in GraphQL both end up calling functions on the server.
  • Similar: Both REST and GraphQL usually rely on frameworks and libraries to handle the nitty-gritty networking boilerplate.
  • Different: In REST, each request usually calls exactly one route handler function. In GraphQL, one query can call many resolvers to construct a nested response with multiple resources.
  • Different: In REST, you construct the shape of the response yourself. In GraphQL, the shape of the response is built up by the GraphQL execution library to match the shape of the query.

Essentially, you can think of GraphQL as a system for calling many nested endpoints in one request. Almost like a multiplexed REST.


What does this all mean?

There are a lot of things we didn’t have space to get into in this particular post. For example, object identification, hypermedia, or caching. Perhaps that will be a topic for a later time. But I hope you agree that when you take a look at the basics, REST and GraphQL are working with fundamentally similar concepts.

I think some of the differences are in GraphQL’s favor. In particular, I think it’s really cool that you can implement your API as a set of small resolver functions, and then have the ability to send a complex query that retrieves multiple resources at once in a predictable way. This saves the API implementer from having to create multiple endpoints with specific shapes, and enables the API consumer to avoid fetching extra data they don’t need.

On the other hand, GraphQL doesn’t have as many tools and integrations as REST yet. For example, you can’t cache GraphQL results using HTTP caching as easily as you can REST results. But the community is working hard on better tools and infrastructure. For example, you can cache GraphQL results in your frontend using Apollo Client and Relay, and more recently also on the server with Apollo Engine.

Got any more ideas about comparisons between REST and GraphQL? Please post them in the comments!

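Taking You to Xinjiang

Two years ago we planned to visit Xinjiang, but the trip fell through because our child was still too young. Later, Xiaoxue and I both became fans of a rapper from Xinjiang named Aire (艾热), who has a song that goes: "I'm going back to Xinjiang, I'm taking you to Xinjiang." Our fondness for Xinjiang comes largely from him. Some of the students at Ninghao (宁皓网) are from Xinjiang too, and one of them is also named Aire :)

Xinjiang is so far away that none of us really know it, yet everyone seems to have their own image of the place. At the very least a string of keywords comes to mind: the Tianshan mountains, snow lotus, Urumqi, Hami melons, raisins, Turpan, Afanti. The picture in your head is probably a stretch of desert plus a vineyard, with a Uyghur uncle singing and dancing.

I first realized that Xinjiang was a beautiful place when I heard about it from a young German man. He said his hometown was beautiful, but it had nothing like Xinjiang's spectacular scenery. I decided I would go and see it when I had the chance. As Xinjiang promotes its tourism industry, more and more people will come here and get to know the place.

Our child starts primary school next year, so Xiaoxue and I talked it over and agreed that we should get out and travel, and that it should be Xinjiang. Xiaoxue was actually a little worried, and I didn't have much of a plan either, because it is so far away. With that Aire song, "I'm going back to Xinjiang, I'm taking you to Xinjiang," playing all day, Xiaoxue started reading Xinjiang travelogues. Perhaps because Xinjiang looked so beautiful, she read on and on and then simply booked the itinerary.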

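The Journey

All I knew was that she had booked a trip to Xinjiang. I saw her draw a small circle on the map, but I had no idea exactly where we were going. Xiaoxue said there were two options. One was a small-car group tour, where several people share a car with a driver and follow a fixed route. The other was renting a car and driving ourselves. What I'm most confident about in my driving is that I'm guaranteed to drive slowly. In 10 years of driving, my longest trip had been two and a half hours, from Jinan to Dezhou. So you can imagine how poor my driving is.

Xiaoxue said there was a stretch of road that winds continuously through the mountains, which made me a bit nervous. But on reflection, sharing a car with strangers was bound to be inconvenient in some ways. Xiaoxue is timid and gets badly carsick; with a driver who drives too aggressively she would either be scared to death or sick to death. So we gritted our teeth and chose to drive ourselves, partly to try a different way of traveling. She didn't have much faith in my driving either, and checked with me several times.

We decided to take the train to Urumqi first and rent a car there. The train is cheaper: flying would cost 10,000 yuan round trip, while 2,000 is enough by train. From Jinan in Shandong, heading west all the way to Urumqi in Xinjiang, the trip normally takes 38 hours. However, we ran into the Zhangye earthquake (I was asleep and didn't know there had been one), which delayed us by four and a half hours, so it took us forty-odd hours to reach Urumqi. That broke our record for the longest train ride, and I don't think any train trip will ever feel long again.

Only after traveling through Xinjiang do you understand what "big" means. A road running dead straight for more than ten kilometers is perfectly normal, and you could never finish photographing the scenery along the way. We only covered a small part of northern Xinjiang. Southern Xinjiang has different scenery, and to really experience Xinjiang's culture you still need to go south. The scenery in northern Xinjiang is unbelievably beautiful. Apart from the sea, you can see almost any natural landscape you could want here: lakes, rivers, streams, forests, grasslands, snow-capped mountains ...

The train was 3,521 km each way and we drove 2,900 km, so the whole trip came to roughly 10,000 km. I didn't feel the slightest bit tired during the entire trip, which surprised me a little. Maybe it was the constant stimulation of the scenery, and also Xiaoxue's good planning: every few hundred kilometers she had scheduled a stopover to rest. She was very worried about wearing me out. In Bayanbulak there was a short stretch we had to walk, and she insisted on carrying the backpack, with the result that she was so exhausted she threw up when we got back to our lodging.

Here is a little mishap for everyone to laugh at. After two days of driving I caught a cold, with a sore throat and dulled hearing. I have another problem when I travel: I get constipated. Early one morning in Kanas I managed to force things along through sheer inner strength and willpower, and straining too hard brought back my hemorrhoids. While driving, my throat would itch, I would cough, my backside would clench, and it would hurt all over again. That is the perfect illustration of a chain reaction.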

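Urumqi

(9/15 to 9/17: Jinan to Urumqi) There is roughly a two-hour time difference between Urumqi and Jinan. In September the sun rises after 8 a.m. and sets after 8 p.m. People here start work at 10 a.m. and finish at 8 p.m. Normally we would have arrived at noon, but because the train was late we arrived in the afternoon. We first found a guesthouse and dropped off our luggage, then found a restaurant nearby and had jiaoma chicken (椒麻鸡). The tap water in Urumqi tastes as if rock sugar had been added.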

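Ulungur Lake

(9/18: Urumqi railway station to Ulungur Lake, 645 km by car) First thing in the morning we went to the rental place and picked up the cheapest car they had, a white Buick Excelle GT. It was covered in dents and scratches; because it was peak tourist season, the rental company hadn't had time to repair it. I didn't mind much, and it later turned out that any car can handle a road trip in Xinjiang, since the roads are very well maintained. Throughout the trip the car had no major problems. There was one minor problem: a guard panel at the front right of the undercarriage came loose. An attentive guesthouse owner in Nalati spotted it in time and found a length of electrical wire to tie the loose panel back in place. When you rent a car in the future, get down and check the undercarriage.

We got in the car and I said, "Let's go! Honey, where are we going?" Xiaoxue: "To Beitun." Me: "All right, off we go!" We were actually heading to a lakeside resort. Xiaoxue didn't know much about the lake either; she only knew we needed to rest at Beitun before going on somewhere else the next day, so she had found a resort near Beitun.

By the time we were almost there, the sun was about to set (8 p.m.). We turned off the main road onto a somewhat desolate-looking side road that climbed all the way uphill. Coming around the last bend and heading downhill, we were stunned by the view in front of us: a vast lake like a bolt of silver lightning. This was Ulungur Lake. Beside the lake is a small village with some yurts and a few two-story buildings.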

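Kanas

(9/19 to 9/20: Ulungur Lake to Kanas, 270 km by car) We stayed by Ulungur Lake for a day and set off for Kanas early the next morning. Most of the route is flat, with a stretch of mountain road just before Kanas. Below the mountains lies a large grassland with streams, cattle and horses grazing, and a few yurts. In autumn most grasslands have already turned yellow, but this one was still green and very beautiful.

We reached Kanas at around 3 p.m. Private cars cannot drive directly into the Kanas scenic area, so people usually park at the entrance (Jiadengyu), where there is a gas station. After parking you buy tickets and then take the Kanas shuttle bus into the scenic area, which takes about 50 minutes.

Kanas is very large inside. The shuttle takes you from Jiadengyu to a transfer center, where you can continue for free on their buses to various places, including the different viewpoints and the lodging areas. Xiaoxue had booked a small wooden cabin in Kanas New Village. I think it would be great to stay here a few more days and hike to different spots. Whatever you do, don't miss the sunrise and sunset.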

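Hemu

(9/20 to 9/21: Kanas to Hemu, 67 km by car) First take the bus from the Kanas transfer center back to Jiadengyu, then drive your own car to Hemu. Hemu is part of the Kanas scenic area, but it is still some distance away, almost all of it on mountain roads. The route follows a township road, and the scenery along the way is gorgeous.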

Stopover

(9/21 to 9/22: Hemu to Burqin to Jinghe County) Hemu to Burqin is 178 km by car, and Burqin to Jinghe County is 656 km.

Nalati

(9/23 to 9/24: Jinghe County to Nalati, 487 km by car) Nalati is a really lovely grassland.

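Bayanbulak

(9/24 to 9/25: Nalati to Bayanbulak, 85 km by car) Getting from Nalati to Bayanbulak means driving a stretch of the Duku Highway. Bayanbulak is a vast alpine grassland at the foot of the Tianshan mountains. In autumn most of the grass has already yellowed; if you come in summer, you will see a sea of green.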

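Duku Highway

(9/25: Bayanbulak to Urumqi, 509 km by car) We took the Duku Highway from Bayanbulak back to Urumqi. The Duku Highway is the road between Dushanzi and Kuqa, 561 km long, connecting northern and southern Xinjiang. It was completed in 1983 after 10 years of construction, during which 168 people lost their lives. Drive this road with respect: first, in awe of nature, and second, in gratitude to everyone who took part in building it. When you pass the Qiaoerma Martyrs' Cemetery, get out of the car and bow. If you are going to drive just one road in your lifetime, the Duku Highway is the first choice.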

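Meeting a Friend

(9/25 to 9/26: Urumqi) A good friend of Xiaoxue's runs a business in Urumqi. They had arranged to meet, so we had dinner together the evening we got back to Urumqi. The next day we strolled around the Grand Bazaar together and had a late lunch at 丝路有约.

She told us a story. Once, driving out on business, she got near Ili, took a wrong turn, and ended up on a mountain. She began winding her way up, and halfway up she looked down: it was beautiful. It was autumn, the leaves had turned, and a few yurts stood beside a clear pool, surrounded by trees. She kept driving up and passed some hikers. As she neared the summit it suddenly began to snow, and the landscape around her was already winter. She started to get a little scared, and the road was getting harder to drive. She turned on the heater and carried on over the mountain. The snow stopped, the weather gradually eased, and the autumn scenery reappeared. To this day she doesn't know the name of that mountain.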

Going Home

(9/26 to 9/28: Urumqi to Jinan) We took the train from Urumqi back to Jinan.

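People Who Moved

Behind everyone who has migrated there is a story.

In a small town in Xinjiang we stayed at a family-run guesthouse. The landlady wasn't a local either; she had come from Hunan. When she arrived she had her two daughters with her. She didn't say why she had moved to this town, but you could sense it had been very hard, and as she talked her eyes reddened. One of her daughters is getting married soon, and she bought a qipao specially from Urumqi.

In Nalati, beside the national highway, there is a small village where we stayed in a little guesthouse run by villagers, who had built a dozen or so guest rooms next to the family home. There is a small courtyard: the family's own quarters are just inside the gate, and the guest rooms are across the courtyard. The guesthouse seems to be run by the family's son, with his mother helping out. The young man was born in Xinjiang, but his grandfather was from Chongqing.

In a small village beside Ulungur Lake there is a restaurant run by a family from Guiyang, who decided in 2004 to move the whole family to this lakeside. We found the place by following the smell of sweet-and-sour fish. The whole family was busy in the kitchen, a simple prefab hut, while guests ate in their own yurts. The family's daughter came out to greet us. She fetched a menu from a yurt they used as a storeroom and led us to another yurt.

Later a few more guests arrived, and she asked us, rather apologetically, whether we would mind eating in the storeroom we had just seen. I hesitated for a moment, but after a day of driving I was hungry, so I agreed. The place is known for its cold-water fish. We ordered pike stewed with tofu, a vegetable dish, and a few bowls of rice. There were a lot of guests, so the wait was quite long. The owner, his wife, and their daughter kept coming in and out of the storeroom to fetch ingredients and drinks and to fill up on hot water. The tap water here tastes as if rock sugar had been added.

Xiaoyu was fascinated by a crab in a bucket. She was both scared and curious about whether it was still alive, and kept nudging the bucket with her foot. "Daddy, Daddy, this crab is still alive." Just then the restaurant's daughter came in. She said the crab had escaped the night before, and all the others had been eaten. Then she pointed to a bag of yogurt on the table: "The little one must be hungry. Have a bag of yogurt first, there's plenty more."

Once our food was all ready, the restaurant was less busy, and the owner's wife came into the storeroom too, holding a mug with her meal in it. A customer came in to pay the bill, but she wouldn't let him; apparently someone else had told her not to let anyone else pay. After he left, another man came in, who must have been eating at the same table. He seemed to know the owner's wife well. She went through the day's order with him, and nobody asked him to pay either; she said she would send the bill to his WeChat later.

Seeing the food in her mug, the man said, "Don't eat too much in the evening, it's too greasy." She replied, "I haven't eaten all day." She turned and told us that he was a doctor friend of hers, that she had had leukemia for four years, and that she meant to live well. After he left, we chatted with her a little more. Curious, I asked her why they had moved from Guiyang to this little lakeside village. She said only that her whole family were fishermen. The storeroom reminded me of the restaurant my parents opened when our family had just moved from Northeast China to Shandong.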

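Closing Notes

Security checks in Xinjiang are very strict. There are checkpoints at all important places, where you must lower all four car windows and take off your hat and glasses. When refueling, all passengers must get out of the car, and only the driver may enter the gas station. You swipe your ID card once at the entrance and go through a security check, then swipe your ID card again at the pump. In the cities, you also go through security checks when entering shops.

Cities, towns, and villages all have 4G coverage. On the road, most areas have a 4G signal. Occasionally it drops out; sometimes China Telecom has no signal, sometimes China Mobile. Usually the signal is back after ten minutes or so of driving, so you basically don't need to worry about connectivity. Download offline maps for your navigation app in advance, and it will keep navigating even without a network signal.

The roads in Xinjiang are all very good, so you don't need to worry about whether your car can get through. Do pay attention to the Duku Highway, though: it is open only a few months each year, and it is closed immediately if it snows or a geological hazard occurs.

If you want to see a bit more green, go in August.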