API Evangelist API Evangelist
API Learnings
Toolbox
API Evangelist LLC

Fundamental Spectral Rules For Governing Your JSON Schema

November 21, 2024 · Kin Lane
Fundamental Spectral Rules For Governing Your JSON Schema

I can’t articulate the importance of enterprise organizations getting their schema house in order as a starting point for API governance. There is an exponential amount of instability and friction introduced across the enterprise because teams are properly using JSON Schema. To augment my services for helping enterprise organizations manage their JSON Schema via GitHub, I am providing customers with a base set of Spectral rules to begin defining the maturity of the schema you are publishing to your Git repository.

Identifier

Every schema added to the central GitHub repository must have a unique identifier applied to each of the schema, giving each object a source of truth no matter where it might find itself across the enterprise.

  json-schema-2020-12-id-error:
    description: Schema MUST have a unique identifier for each object.
    message: Schema MUST Have a $id.
    severity: error
    given: $
    then:
      field: "$id"
      function: truthy
  json-schema-2020-12-id-info:
    description: Schema MUST have a unique identifier for each object.
    message: Schema Has an $id.
    severity: info
    given: $
    then:
      field: "$id"
      function: falsy 
  json-schema-2020-12-id-source-url-error:
    description: The id for a schema MUST have a valid URL pointing to the central register.
    message: The $id for schema MUST reference the central registry.
    severity: error
    given: $
    then:
      field: "$id"
      function: pattern
      functionOptions:
        match: \b(example.com)\b
  json-schema-2020-12-id-source-url-info:
    description: The id for a schema MUST have a valid URL pointing to the central register.
    message: The $id for schema references the central registry.
    severity: info
    given: $
    then:
      field: "$id"
      function: pattern
      functionOptions:
        notMatch: \b(example.com)\b

I will add more rules and regexes to get more explicit with the naming conventions applied as part of the unique identifier, but this is a damn good place to start when it comes to governing the source of truth for your schema.

Schema

Next up, every schema should have a $schema property present, and be using the latest draft of the JSON Schema specification—so very important. This work augments my existing policy and rule work across OpenAPI and other artifacts in use.

  json-schema-2020-12-schema-error:
    description: Schema MUST have a $schema property.
    message: Schema MUST Have a $schema.
    severity: error
    given: $
    then:
      field: "$schema"
      function: truthy
  json-schema-2020-12-schema-info:
    description: Schema MUST have a $schema property.
    message: Schema Has a $schema.
    severity: info
    given: $
    then:
      field: "$schema"
      function: falsy  
  json-schema-2020-12-schema-draft-error:
    description: The $schema for a schema MUST use the latest draft
    message: The $schema for schema MUST use the latest draft.
    severity: error
    given: $
    then:
      field: "$schema"
      function: pattern
      functionOptions:
        match: 'https://json-schema.org/draft/2020-12/schema'
  json-schema-2020-12-schema-draft-info:
    description: The $schema for a schema MUST use the latest draft
    message: The $schema for schema uses the latest draft.
    severity: info
    given: $
    then:
      field: "$schema"
      function: pattern
      functionOptions:
        notMatch: 'https://json-schema.org/draft/2020-12/schema'

This set of rules will have one of the biggest impacts across the enterprise in helping stabilize things, but is one that will take a shit ton of work to implement because you are going to have to update a lot of libraries and tooling across the enterprise.

Title

Now I want to make sure every object has a name, but also that it is applied as the title of the object. This is something we’ll have to align with the unique identifier naming patterns, but having the title property present and meeting minimum governance is critical.

  json-schema-2020-12-title-error:
    description: Schema MUST have a title for the entire object, describing an object in plain language.
    message: Schema MUST Have a Title.
    severity: error
    given: $
    then:
      field: title
      function: truthy
  json-schema-2020-12-title-info:
    description: Schema MUST have a title for the entire object, describing an object in plain language.
    message: Schemas Has a Title.
    severity: info
    given: $
    then:
      field: title
      function: falsy
  json-schema-2020-12-title-pascal-case-error:
    description: Schema names should always be PascalCase, and be used in title for a schema to help ensure readability and consistency.
    message: Schema Title MUST Be PascalCase.
    severity: error
    given: $
    then:
      - field: title
        function: pattern
        functionOptions:
          match: ^[A-Z](([a-z]+[A-Z]?)*)$
      - field: title
        function: pattern
        functionOptions:
          match: ^[A-Z](([a-z0-9]+[A-Z]?)*)$
  json-schema-2020-12-title-pascal-case-info:
    description: Schema names should always be PascalCase, and be used in title for a schema to help ensure readability and consistency.
    message: Schema Title Are PascalCase.
    severity: info
    given: $
    then:
      - field: title
        function: pattern
        functionOptions:
          notMatch: ^[A-Z](([a-z]+[A-Z]?)*)$
      - field: title
        function: pattern
        functionOptions:
          notMatch: ^[A-Z](([a-z0-9]+[A-Z]?)*)$
  json-schema-2020-12-title-length-error:
    description: Schema names and resulting title should be kept to less than 25 characters.
    message: Schema Names MUST Be Less Than 25 Characters
    severity: error
    given: $
    then:
      field: title
      function: length
      functionOptions:
        max: 25       
  json-schema-2020-12-title-length-error:
    description: Schema names and resulting title should be kept to greater than 3 characters.
    message: Schema Names MUST Be Greater Than 3 Characters
    severity: error
    given: $
    then:
      field: title
      function: length
      functionOptions:
        min: 3                 

You will want to adjust the minx and max values to match the needs of your organizations and it is a fun one to bike shed with teams producing schema, but these rules provide a great place to begin when doing this important work.

Description

Now let’s apply similar logic to the description for each object, which is absent on most any schema I struggle across, so just making sure there is a description, along with a minimum bar for length and other things is a good place to start.

  json-schema-2020-12-description-error:
    description: Schema MUST have a description for the entire object, explaining in plain language what the object is for.
    message: Schema MUST Have a Description.
    severity: error
    given: $
    then:
      field: description
      function: truthy
  json-schema-2020-12-description-info:
    description: Schema MUST have a description for the entire object, explaining in plain language what the object is for.
    message: Schemas Has a Description.
    severity: info
    given: $
    then:
      field: description
      function: falsy
  json-schema-2020-12-description-length-error:
    description: The description for a schema should not be too long, helping keep it as readable and consumable as possible by users.
    message: Schema Description MUST be Less Than 250 Characters
    severity: error
    given: $
    then:
      field: description
      function: length
      functionOptions:
        max: 250

Once I get the base in place I will also move towards applying some sort of vocabulary that will trigger rules when certain forbidden words are used, and identify things that are better expressed as JSON Schema properties, allowing us to get more precise with governance.

Types

Next, let’s establish a baseline for object types by requiring there is a type property at the top level, making sure we are always explicit with any JSON Schema we’ve published, ensuring the fundamentals are always present.

  json-schema-2020-12-type-error:
    description: All schema must have a type property.
    message: Schema MUST Have Type Property
    severity: error
    given: $
    then:
      field: type
      function: truthy
  json-schema-2020-12-type-info:
    description: All schema must have a type property.
    message: Schema Have Type Property
    severity: info
    given: $
    then:
      field: type
      function: falsy        

I don’t want to be opinionated at the type level yet, and am looking to get feedback for what else should be present here when it comes to requiring an object or array, focusing on just the fundamentals right now.

Property Names

Moving on to the properties of each object, we always want to be making sure our casing of object properties are consistent across the enterprise, making sure objects use consistent naming and have at least one property present.

  json-schema-2020-12-properties-names-camel-case-error:
    description: All schema properties should be camel case for consistency.
    message: Schema Property Names MUST Be camelCase.
    severity: error
    given: $.properties
    then:
      - field: "@key"
        function: pattern
        functionOptions:
          notMatch: ^[A-Z][a-z0-9]*[A-Z0-9][a-z0-9]+[A-Za-z0-9]*$
  json-schema-2020-12-properties-names-camel-case-info:
    description: All schema properties should be camel case for consistency.
    message: Schema Property Names Are camelCase.
    severity: info
    given: $.properties
    then:
      - field: "@key"
        function: pattern
        functionOptions:
          match: ^[A-Z][a-z0-9]*[A-Z0-9][a-z0-9]+[A-Za-z0-9]*$
  json-schema-2020-12-properties-names-max-error:
    description: It makes sense to keep schema property names a consistent length.
    message: Schema Properties Name Length MUST Be Less Than 25
    severity: error
    given: $.properties
    then:
      field: "@key"
      function: length
      functionOptions:
        max: 25
  json-schema-2020-12-properties-names-min-error:
    description: It makes sense to keep schema property names a consistent length.
    message: Schema Properties Name Length MUST Be More Than 3
    severity: error
    given: $.properties
    then:
      field: "@key"
      function: length
      functionOptions:
        min: 3        

There is a lot to think about when it comes to the property object depending on how you use more advanced JSON Schema elements, but this is meant to be a starting place for anyone looking to govern their schema and covers the examples available on the JSON Schema site.

Property Descriptions

All properties should have descriptions, keeping them not too short and not too long, and similar object descriptions, you will want to get more precise with a vocabulary and other ways of fine tuning how teams are describing properties.

  json-schema-2020-12-properties-descriptions-error:
    description: Schema property descriptions should be complete and useful to users.
    message: Schema Properties MUST Have Description
    severity: error
    given: $.properties.*
    then:
      field: description
      function: truthy
  json-schema-2020-12-properties-descriptions-info:
    description: Schema property descriptions should be complete and useful to users.
    message: Schema Properties Have Description
    severity: info
    given: $.properties.*
    then:
      field: description
      function: falsy
  json-schema-2020-12-properties-descriptions-length-error:
    description: Schema property descriptions should not be too long, helping ensure they aren't too verbose and complex.
    message: Schema Properties Description MUST Be Less Than 250 Characters
    severity: error
    given: $.properties.*
    then:
      field: description
      function: length
      functionOptions:
        max: 250
  json-schema-2020-12-properties-descriptions-length-error:
    description: Schema property descriptions should not be too short, helping ensure they are actually helpful.
    message: Schema Properties Description MUST Be More Than 10 Characters.
    severity: error
    given: $.properties.*
    then:
      field: description
      function: length
      functionOptions:
        min: 10 

Ideally this work is not being done by developers and is being handled by a central set of architects and content writers, because developers are going to always write the shortest and least helpful descriptions for properties.

Types

Similar to object type, we want to establish the baseline for property types, but I am also refraining from going deep down this rabbit hole yet, and will get more advanced here with other types, regexes, and approaches to standardizing vocabulary at this layer.

  json-schema-2020-12-type-error:
    description: All schema properties must have a type property.
    message: Schema MUST Have Type Property
    severity: error
    given: $.properties.*
    then:
      field: type
      function: truthy
  json-schema-2020-12-type-info:
    description: All schema properties must have a type property.
    message: Schema Have Type Property
    severity: info
    given: $.properties.*
    then:
      field: type
      function: falsy               

It will take some research harvesting common JSON Schema in use out there to understand how people are using property types — I know we can go a long way based upon the spec, but the real value of governance comes in aligning it to the enterprise.

Properties Required

I threw required in here as well, but I am skeptical that this is a fundamentals based upon how more advanced JSON schema elements are used, but what the hell — I am open to push back on whether this should be part of the basics, as well as properties, based upon $def and other ways of using.

  json-schema-2020-12-required-error:
    description: All schema should have a required property.
    message: Schema MUST Have a Required Property for Objects.
    severity: error
    given: $
    then:
      field: required
      function: truthy
  json-schema-2020-12-required-info:
    description: All schema should have a required property.
    message: Schema Have a Required Property for Objects.
    severity: info
    given: $
    then:
      field: required
      function: falsy
  json-schema-2020-12-required-error:
    description: All schema should have at least one required property.
    message: MUST Be At Least One Tag
    given: $
    severity: error
    then:
      field: required
      function: length
      functionOptions:
        min: 1

What is required can get pretty freaky, so I am skeptical of the governance around this, but will harvest the JSON Schema from common API providers to see how they are expressing things, then come up with some basics based upon what I see.

Just the Fundamentals

It was hard for me to stop there, but beginning with the fundamentals while keeping simple is so important. I was talking to Ben from JSON Schema this morning and told him I’d share my base set, and I want to do some more marketing and storytelling around my base JSON Schema management services via GitHub repositories, so this starter Spectral ruleset satisfies these needs. While there are plenty of folks governing JSON Schema with Spectral via OpenAPI, there aren’t many folks who are governing the maturity of their JSON Schema by themselves using Spectral rules.

I will bundle these rules up with the CI/CD pipelines and other automation I am applying to the schema I am managing via GItHub repositories. All schema published to a central schema registration should pass all these rules. Another thing to note with these rules is that I strive to always have positive and negative for each pattern I am governing, which is not something others are doing, but I feel is critical to the behavioral aspect of API governance across teams. If you want to get at the rules I’ve published to a GitHub Gist, but will also include via the API Commons rules gallery when I have more time.