
We're looking into using GraphQL for version 2 of a headless CMS we're developing.

In version 1 of this CMS, we used JSON Schema to validate each document before it was saved in the database -- for example, a blog article would be validated against the Article schema, and a roundup ("best of" list) would be validated against the Roundup schema.

For version 2, we're contemplating using GraphQL for the API. And then it occurred to us that the GraphQL schema is basically parallel to the JSON Schema -- it describes the document structure, field types, etc.

So we could simply have "one source of schema truth", the GraphQL schema, and use this both for querying documents and for validating new documents when a new revision is being saved. (Note that I'm talking about validating JSON data against a GraphQL schema, not validating a GraphQL query against a schema.)

I figure the data would be validated against all the fields in the schema, except deprecated fields, because you only want to validate against the "latest version" of the fields.
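To make the "skip deprecated fields" rule concrete, here is a minimal sketch in Python. It assumes the relevant part of the GraphQL schema has already been loaded into a plain dict; the `TYPES` table and the `validate` function are hypothetical names written for illustration, not part of any GraphQL library. Rejecting a deprecated field as if it were unknown is one possible reading of the rule; silently ignoring it would be another.

```python
# Hypothetical table mirroring RoundupHotel and RoundupPhoto from the schema.
# In practice it would be derived from the parsed GraphQL schema.
SCALARS = {"Int": int, "String": str, "Boolean": bool}

TYPES = {
    "RoundupPhoto": {
        "url": {"type": "String", "required": True},
        "caption": {"type": "String"},
    },
    "RoundupHotel": {
        "url": {"type": "String", "required": True},
        "photoUrl": {"type": "String", "deprecated": True},
        "photos": {"type": "RoundupPhoto", "required": True, "list": True},
        "blurb": {"type": "String", "required": True},
        "title": {"type": "String"},
    },
}


def validate(data, type_name, path="$"):
    """Return a list of error strings; an empty list means the data is valid."""
    if type_name in SCALARS:
        py_type = SCALARS[type_name]
        # bool is a subclass of int in Python, so reject it for Int explicitly.
        wrong = not isinstance(data, py_type) or (
            py_type is int and isinstance(data, bool)
        )
        return [f"{path}: expected {type_name}"] if wrong else []
    if not isinstance(data, dict):
        return [f"{path}: expected object of type {type_name}"]

    # Assumption: only non-deprecated fields count as "the latest version".
    fields = {
        name: spec
        for name, spec in TYPES[type_name].items()
        if not spec.get("deprecated")
    }
    errors = [
        f"{path}.{name}: unknown or deprecated field"
        for name in data
        if name not in fields
    ]
    for name, spec in fields.items():
        value = data.get(name)
        if value is None:
            if spec.get("required"):
                errors.append(f"{path}.{name}: required field is missing or null")
        elif spec.get("list"):
            if not isinstance(value, list):
                errors.append(f"{path}.{name}: expected list")
            else:
                for i, item in enumerate(value):
                    errors += validate(item, spec["type"], f"{path}.{name}[{i}]")
        else:
            errors += validate(value, spec["type"], f"{path}.{name}")
    return errors
```

Calling `validate(hotel_dict, "RoundupHotel")` on a document that still carries `photoUrl` reports that field as an error, while the same document without it passes.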

We could do one of three things:

  1. Use the GraphQL AST directly to validate a document, i.e., write a data validator ourselves.
  2. Use the GraphQL AST to generate a JSON Schema, and use a standard JSON Schema validator to actually validate it.
  3. Just accept that GraphQL isn't quite the right fit for validation, and define the schema twice -- once in GraphQL for querying, and again in JSON Schema for validation (annoying and error-prone to keep them in sync).
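As a rough illustration of option 2, the sketch below turns a hand-written table of types into a JSON Schema document. The `TYPES` table and `to_json_schema` are hypothetical and stand in for whatever a real GraphQL parser would provide; the choice of draft-07 and of `additionalProperties: false` are assumptions too.

```python
# Hypothetical table mirroring RoundupHotel and RoundupPhoto from the schema.
SCALARS = {"Int": "integer", "String": "string", "Boolean": "boolean"}

TYPES = {
    "RoundupPhoto": {
        "url": {"type": "String", "required": True},
        "caption": {"type": "String"},
    },
    "RoundupHotel": {
        "url": {"type": "String", "required": True},
        "photoUrl": {"type": "String", "deprecated": True},
        "photos": {"type": "RoundupPhoto", "required": True, "list": True},
        "blurb": {"type": "String", "required": True},
        "title": {"type": "String"},
    },
}


def to_json_schema(root):
    """Build a draft-07 JSON Schema dict whose root is the given type."""

    def type_schema(type_name):
        if type_name in SCALARS:
            return {"type": SCALARS[type_name]}
        return {"$ref": f"#/definitions/{type_name}"}

    definitions = {}
    for type_name, fields in TYPES.items():
        properties, required = {}, []
        for name, spec in fields.items():
            if spec.get("deprecated"):
                continue  # validate only against the latest fields
            schema = type_schema(spec["type"])
            if spec.get("list"):
                schema = {"type": "array", "items": schema}
            if spec.get("required"):
                required.append(name)
            else:
                # A nullable GraphQL field may be absent or null in the JSON.
                schema = {"anyOf": [schema, {"type": "null"}]}
            properties[name] = schema
        definition = {
            "type": "object",
            "properties": properties,
            "additionalProperties": False,
        }
        if required:
            definition["required"] = required
        definitions[type_name] = definition

    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "allOf": [{"$ref": f"#/definitions/{root}"}],
        "definitions": definitions,
    }
```

The result of `to_json_schema("RoundupHotel")` could then be handed to any standard JSON Schema validator. A union such as `ArticleSection` is not handled here; it would presumably map to a `oneOf` over the member types.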

Questions: Are #1 and #2 silly ideas? Are there any GraphQL tools which do this kind of data validation? Are there any other ways to achieve this without defining the schema twice?

For reference, our backend will be written in Python but the admin UI will be client-side React and JavaScript. This is a cut-down version of the kind of GraphQL schema we're talking about (supports "Article" and "Roundup" document types):

schema {
    query: Query
}

type Query {
    documents: [Document!]!
    document(id: Int): Document!
}

interface Document {
    id: Int!
    title: String!
}

type Article implements Document {
    id: Int!
    title: String!
    featured: Boolean!
    sections: [ArticleSection!]!
}

union ArticleSection = TextSection | PhotoSection | VideoSection

type TextSection {
    content: String!
    heading: String
}

type PhotoSection {
    sourceUrl: String!
    linkUrl: String
    caption: String
    content: String
}

type VideoSection {
    url: String!
}

type Roundup implements Document {
    id: Int!
    title: String!
    isAward: Boolean!
    intro: String
    hotels: [RoundupHotel!]!
}

type RoundupHotel {
    url: String!
    photoUrl: String @deprecated(reason: "photoUrl is deprecated; use photos")
    photos: [RoundupPhoto!]!
    blurb: String!
    title: String
}

type RoundupPhoto {
    url: String!
    caption: String
}
Ben Hoyt
  • Do you know of https://github.com/jakubfiala/graphql-json-schema ? I tried it out with your GraphQL schema and the basics look fine to me. https://runkit.com/fdlk/59baf17d01ac700012e110b4 The devil is probably in the details. – flup Sep 14 '17 at 21:15
  • is there a reason you want to use GraphQL? It seems like you will be losing a lot since you have actual schema validation. –  Sep 20 '17 at 14:05
  • Hi there, just came across your question and in our company, we would like to use GraphQL to schema/validate our JSON product, did you find any solutions in the end? – zenoh Jul 24 '18 at 08:30
  • @ben-hoyt what did you end up doing? – Adam Arold Sep 24 '21 at 10:43
  • @AdamArold Hah, you'll laugh, but we used Wordpress and the problem was no more. :-) The company ended up going in quite a different direction and not building their own CMS. – Ben Hoyt Sep 25 '21 at 19:42
  • Well, if it works it works, right? – Adam Arold Oct 01 '21 at 13:57

1 Answer

Level of certainty in evolving situation

GraphQL is still an evolving technology (as it says right at the top of the spec document) so it's safe to say there are no truly "correct" answers for this.

Generalities

Input object types (`input` in GraphQL schema language terms), together with lists (`[]`) and the various scalars, seem to cover everything you can express in JSON.

If the Python implementation of GraphQL conforms with the spec, then supplying the data either as GraphQL literals or (better) as variables should give you everything a custom validator would: the server checks the input against the declared input types before any resolver runs.

Recommendation for your situation

Based on my work with GraphQL so far, my suggestion is to "go with the grain". If your GraphQL schema conforms with what your data architecture requires, just use normal GraphQL validation. If you do make your own validation, it should come after GraphQL has done its normal data-shape checking.

To summarise the above points, and to answer your question with a question: what's wrong with letting GraphQL in its normal functioning do the validation heavy lifting?

Ed.
  • Thanks. Having looked into this a bit further, I think the main problem is that input types can't be or contain unions. Our schema definitely contains unions, so we can't use it as an input type for a mutation (without hacks). See also https://github.com/facebook/graphql/issues/114 – Ben Hoyt Sep 19 '17 at 13:25
  • Am I wrong that you could have one input query (be it a `query` or a `mutation`) per input type? (You'd generate those automatically from your existing input unions) Or are you saying your requester wouldn't know which type it was? If that's the case, you could have one generic "dispatching" query, whose output was a union type? – Ed. Sep 19 '17 at 17:04
  • I'm not entirely sure what you're suggesting: do you mean have `updateArticle()` and `updateRoundup()` mutations? Yes, you could do that. Inner unions are still a problem, such as `ArticleSection` above -- you can hack around that with an `ArticleSectionInput` type with nullable fields for each type in the union, e.g. `textSection`, `photoSection` and `videoSection` ... but it's pretty messy. – Ben Hoyt Sep 19 '17 at 20:43
  • I am saying to have `updateArticle` and `updateRoundup`. You'd also need `createTextSection`, `createPhotoSection`, etc. Since it's input, you'd know what you were making, so there'd be no need for polymorphism. Given that GraphQL already dictates a separation between input and output objects, there is going to be a level of repetition whatever you do. – Ed. Sep 21 '17 at 00:47
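
For illustration only, the wrapper workaround mentioned in the comments (an `ArticleSectionInput` with one nullable field per union member) leaves the server to check that exactly one of those fields is set. A minimal Python sketch of that check follows; `unwrap_section` is a hypothetical helper name, and the three keys are the ones suggested in the comment above.

```python
# Field names taken from the comment thread; the helper itself is hypothetical.
SECTION_KEYS = ("textSection", "photoSection", "videoSection")


def unwrap_section(wrapper):
    """Return (key, value) for the single populated field of the wrapper."""
    unknown = set(wrapper) - set(SECTION_KEYS)
    if unknown:
        raise ValueError(f"unknown keys: {sorted(unknown)}")
    present = [(k, wrapper[k]) for k in SECTION_KEYS if wrapper.get(k) is not None]
    if len(present) != 1:
        raise ValueError(
            f"expected exactly one of {SECTION_KEYS}, got {len(present)}"
        )
    return present[0]
```

This is the part GraphQL's own input validation would not enforce for such a wrapper type, which is one reason the approach feels messy.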