Hacker News

I wrote the gist; happy to answer questions (although out and on my phone right now so I won't be super fast).



A common problem in a service-oriented architecture is needing to minimize the number of round trips to a particular service.

Example: To render a UI view, I need to fetch a list of friends, a list of posts, and a list of notifications. All 3 of these data types (friends, posts, notifications) have users associated with them. Instead of fetching users for the 3 data types 3 separate times, I need to flatten that into a single call. I end up with something like this:

  var [posts, friends, notifs] = yield [
    getPosts(),
    getFriends(),
    getNotifs()
  ]; // executes in parallel

  // collect every referenced user ID, deduplicated
  var userIDs = new Set();
  posts.forEach(p => userIDs.add(p.userID));
  friends.forEach(f => userIDs.add(f.userID));
  notifs.forEach(n => userIDs.add(n.userID));

  var users = yield getUsers([...userIDs]);

  return {posts, friends, notifs, users};
You can see where this gets cumbersome. I have to compose one of these for every kind of UI view at the root level. On the one hand it's very explicit and the flow of data is very clear, but it also means relationships between data are duplicated in more than one place (often many places).

Could GraphQL help with that scenario?


(I've used GraphQL at Facebook, but I don't work on it.)

If I understand your post correctly, I think it could. GraphQL prefers to expose full objects, rather than IDs, which lets you query for data inside objects at the same time. So for your example above, you would only need one GraphQL query:

  viewer() {
    posts {
      node {
        author { id, name, favorite_color },
        // any other post data you want
      }
    },
    friends {
      node {
        id,
        name,
        favorite_color,
      }
    },
    notifications {
      node {
        source { id, name, favorite_color },
        // any other notification fields you want
      }
    },
  }
The "id, name, favorite_color" is just a sample set of fields; you could replace it with whichever fields your components need. Relay also has functionality to avoid duplicating the set of fields, which is especially useful when you want to add or remove a field from what a reusable component needs.
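For example, a shared field set can be pulled out into a named fragment and spread wherever it's needed (a sketch using fragment syntax as GraphQL later specified it; the fragment name is invented, and the exact pre-release syntax may differ):

```graphql
fragment userFields on User { id, name, favorite_color }

viewer() {
  friends {
    node {
      ...userFields
    }
  },
  notifications {
    node {
      source { ...userFields }
    }
  }
}
```

Adding a field to `userFields` then updates every place that spreads it, instead of requiring edits to each query by hand.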


What language is that?


That's ES6 JavaScript (running on Node 0.11.15). We've already migrated most of our production code to this and we're loving it!

If you like that kind of control flow, check out the library we wrote to do it: https://github.com/storehouse/engen


Why not use let instead of var?


Only habit. We're so used to var. But yea, need to transition to let for most things.


I think migrating to const makes more sense. In practice only a few things actually need to be mutated.


Didn't realize const was in. Will definitely be using it.


I believe that's JavaScript (ES6).


Great work! Relay is really exciting.

My concern is regarding the server-side implementation of the GraphQL endpoint. My understanding is that GraphQL endpoints couple tightly to a graph-oriented backend. Facebook already has TAO so it's a no-brainer, but how feasible do you think a normal SQL-based backend can adapt to efficiently process arbitrary GraphQL queries? Or would it be easier to switch to a graph-oriented database (e.g. neo4j) instead? The former option seems to be quite an engineering endeavor, while the latter is just too risky right now.


Not a Facebooker so I haven't worked with GraphQL per se (sad face), but a quick-and-dirty way to do this type of deep data fetching on an SQL database is to fetch one layer at a time: do a breadth-first search on the parsed GraphQL, and at the end of each layer you should have enough data in API-server memory to know the IDs of everything in the subsequent layer. Rails has largely switched to doing this rather than deep joins. Given that the number of queries is based on how many layers deep your GraphQL is, rather than the number of records, you avoid the N+1 problem without needing to switch to a graph database, and there's a relatively constant number of DB hops required.

The main caveat I see is that cursor-based pagination is hard to emulate in SQL without persistent database connections... not sure if there's a scalable way to do this while keeping the GraphQL endpoint stateless.
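The layer-at-a-time idea can be sketched as follows. This is just an illustration: the table contents, field names, and helper functions are all invented, with in-memory arrays standing in for SQL tables.

```javascript
// In-memory stand-ins for SQL tables (invented data):
const postsTable = [
  {id: 1, authorID: 10, body: 'first post'},
  {id: 2, authorID: 11, body: 'second post'},
  {id: 3, authorID: 10, body: 'third post'},
];
const usersTable = [
  {id: 10, name: 'Ada'},
  {id: 11, name: 'Lin'},
];

// BFS layer 1: fetch the root records.
// In SQL this would be something like: SELECT * FROM posts
function fetchPosts() {
  return postsTable;
}

// BFS layer 2: one batched query for every user referenced by layer 1.
// In SQL: SELECT * FROM users WHERE id IN (10, 11)
function fetchUsersByIDs(ids) {
  const wanted = new Set(ids);
  return usersTable.filter(u => wanted.has(u.id));
}

function resolvePostsWithAuthors() {
  const posts = fetchPosts();                                 // 1 query
  const authorIDs = [...new Set(posts.map(p => p.authorID))]; // dedupe IDs
  const users = fetchUsersByIDs(authorIDs);                   // 1 query, not N
  const usersByID = new Map(users.map(u => [u.id, u]));
  return posts.map(p => Object.assign({}, p, {author: usersByID.get(p.authorID)}));
}
```

Two queries total for any number of posts; a third layer of nesting in the query would add one more batched query, not one per record.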

Having experimented with neo4j a year ago, I would concur that it hasn't been battle-tested for use as a high-availability database for web apps; it was used much more for offline analytics AFAIK. Thingdom did some REALLY cool things using neo4j from JS, and their ideas for modeling news feeds were really mindblowing, but I'm hesitant to put anything mission-critical on it just yet from both a technical-debt perspective and a speed/scalability perspective.


GraphQL itself doesn't make any assumptions about the storage engine being graph-oriented or otherwise. It's true that the grammar makes it easy to express graph-like relationships (eg. using the notion of one-to-many connections), and from the perspective of the application it can request an arbitrarily complex hierarchy of nested objects, but how and where the underlying data gets fetched is implementation specific and I can imagine adapting GraphQL to any number of different data source types.
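A minimal illustration of that decoupling (all names here are invented; this is not GraphQL's actual implementation): each field in a parsed query can map to its own resolver function, and each resolver is free to hit a different backend.

```javascript
// Each top-level field resolves through its own function; one could run an
// SQL query, another hit a key-value cache, another traverse a graph store.
// Stubs stand in for real backends here.
const resolvers = {
  posts: () => [{id: 1, body: 'hello'}],    // imagine: SQL query
  friends: () => [{id: 10, name: 'Ada'}],   // imagine: graph-store traversal
};

// Resolve a (pre-parsed) list of requested fields against those backends.
function resolveQuery(requestedFields) {
  const result = {};
  for (const field of requestedFields) {
    result[field] = resolvers[field]();
  }
  return result;
}
```

The query shape the client sees is the same regardless of which storage engine sits behind each resolver.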


Why not just let components subscribe to specific events?

In our framework, it's mostly event based, so if some data changes, events are fired and then things happen based on subscriptions. Just wondering what advantages are offered here over that setup.


Because then it wouldn't be React. React tries to solve the conceptual spaghetti created by heavily evented systems, by putting everything into one stream of notifications so that you can immediately know where things are being changed and by what. It prefers synchronous over asynchronous for this reason also.


Sorry, what? Are you saying basically that event subscriptions are all declared in some declarative language, making them all easy to find "in one place"?


Could you give an example of the type of bug you're referring to here?

> This means that only the fields of an object that a component explicitly asks for will be accessible to that component, even if other fields are known and cached in the store (because another component requested them). We call this masking, and it makes it impossible for implicit data dependency bugs to exist latently in the system.


A component requests n fields of data but actually uses n+1, and the extra field is only populated because another component requested it. Remove that other component and boom, a seemingly unrelated component breaks. With GraphQL you have to specify exactly what fields you want, and you can't accidentally get access to more than you requested.
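A contrived sketch of that failure mode and how masking prevents it (the store shape and the `mask` helper are invented for illustration):

```javascript
// The store caches every field any component has fetched:
const cachedUser = {id: 1, name: 'Ada', favorite_color: 'green'};

// Without masking, a component that only declared ['id', 'name'] can still
// read cachedUser.favorite_color, because some other component happened to
// fetch it. That works until the other component (and its query) is deleted.

// With masking, a component sees only the fields it declared:
function mask(record, declaredFields) {
  const visible = {};
  for (const field of declaredFields) {
    visible[field] = record[field];
  }
  return visible;
}

const masked = mask(cachedUser, ['id', 'name']);
// masked.favorite_color is undefined no matter what else is cached, so the
// implicit dependency fails immediately instead of lurking as a latent bug.
```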


What tool do you use on the server side to parse GraphQL?


We have our own parser at Facebook. Our reference implementation will include a parser, and we'll be releasing a spec as well, so it should be possible to build alternative implementations should you wish.


Is there a connection between Relay and Haxl?


no.


Code example?


https://www.youtube.com/watch?v=9sc8Pyc51uU - there are some examples in the talk from React JS Conf.

We haven't yet finalized what API we will provide when we open source Relay. You'll certainly get one object, called Relay, however ;)


I'm not watching a 30 minute video to get an idea of what writing code in this framework will look like. Can you really not give a single code example, even if the API is in a fluid state?


Application code will look like regular React components. The only difference is you declare what data you want in a GraphQL fragment.

Here's a photo from the presentation that shows an example: https://twitter.com/devonbl/status/560532680513556481.



