Consumer-Driven Contracts: A Service Evolution Pattern


About

Reading up on Pact for contract testing between API providers and consumers led me to a whitepaper by Ian Robinson from ThoughtWorks.

Cliffnotes

CDC (Consumer-Driven Contracts, or Derived Contracts) is a pattern for evolving API contracts, offered as an alternative to schema extension (versioning) or “just enough” validation.

  • SOA
    • High-value business logic in discrete, reusable, encapsulated, testable, modifiable, connected services
    • Benefits are only fully realised if services can evolve independently
    • Paradoxically, APIs couple providers and consumers rather than decouple (i.e. providers scared to make changes)
  • Existing strategies:
    • Update API: best for providers (simple change), worst for consumers (potentially all consumers must re-implement)
    • Extend API maintaining backwards compatibility: worst for providers (more complexity, less elegant, harder to maintain), best for consumers (no change). Really big changes still require providers and consumers to jump at the same time.
  • W3C Technical Architecture Group (TAG) has proposed versioning strategies ranging from none (no versioning) to big bang (abort if any changes), with backwards/forwards-compatible strategies in between
  • Provider contracts
    • Term introduced meaning a complete set of business functions (i.e. APIs)
    • Stable and immutable for a set period of time!!
  • Consumer contracts
    • Consumers send providers expectations of what specific data is being used, based on Schematron assertions (i.e. incomplete schemas validating only what the consumer actually uses)
  • CDC:
    • CDC container consists of schemas, interfaces (endpoint definitions and possibly links), conversations (provider state!!), policy (usage of data) and quality of service (availability, latency and throughput) for a provider contract
    • Allows providers to track and analyse how services are being used in order to better meet customer demands
    • Adds complexity and protocol-dependence
    • Doesn’t protect from breaking changes or reduce coupling, but rather adds visibility
  • Misc:
    1. When developing a consumer, do “just enough” validation to get the data you need in accordance with the Robustness Principle (i.e. liberal attitude towards incoming data)
    2. Would be nice for consumer responses to possibly include data fields they want in the future (i.e. feature requests)
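
To make the consumer contract idea concrete, here is a minimal sketch in Python. It is not how the whitepaper (which uses Schematron) or Pact implements it: the consumer names, the `CONSUMER_CONTRACTS` table and both functions are assumptions made up for illustration. Each consumer declares only the fields it actually reads, and the provider checks a candidate response against every contract before releasing a change.

```python
# Hypothetical consumer contracts: each consumer lists only the fields it reads
# and the type it expects. Names are illustrative, not from the whitepaper.
CONSUMER_CONTRACTS = {
    "billing-app": {"id": str, "email": str},
    "mobile-app": {"id": str, "name": str},
}


def broken_expectations(response: dict, contract: dict) -> list:
    """Return the contract fields that the response no longer satisfies."""
    return [
        field
        for field, expected_type in contract.items()
        if not isinstance(response.get(field), expected_type)
    ]


def check_provider_change(response: dict) -> dict:
    """Map each consumer to the expectations a candidate response would break."""
    report = {}
    for consumer, contract in CONSUMER_CONTRACTS.items():
        failures = broken_expectations(response, contract)
        if failures:
            report[consumer] = failures
    return report


# The provider renames "name" to "full_name": only mobile-app is affected.
candidate = {"id": "jsmith", "email": "j@example.com", "full_name": "John Smith"}
print(check_provider_change(candidate))
```

Extra fields such as `full_name` are ignored, which is the “just enough” validation from the Misc list: a consumer only breaks when something it actually uses goes missing.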

Conclusion

CDC violates REST (stateful), relies on immutable periods of time (fragility), adds too much complexity and overhead. A better standard for receiving feedback from consumers would be an optional (yet standardised) header in the request, which the provider could track.
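
A rough sketch of what tracking such a header could look like on the provider side. The header name `X-Consumer-Fields` and the `record_feedback` function are purely hypothetical; no such standard exists.

```python
from collections import Counter

# Hypothetical header name: nothing standardises this today.
FEEDBACK_HEADER = "X-Consumer-Fields"


def record_feedback(headers: dict, usage: Counter) -> None:
    """Tally the fields a consumer says it reads; the header is optional."""
    raw = headers.get(FEEDBACK_HEADER, "")
    usage.update(field.strip() for field in raw.split(",") if field.strip())


usage = Counter()
record_feedback({FEEDBACK_HEADER: "id, email"}, usage)
record_feedback({FEEDBACK_HEADER: "id,name"}, usage)
record_feedback({}, usage)  # consumers that send nothing are simply not counted
print(usage.most_common(1))
```

The provider stays stateless per request; the tally is only analytics and can be dropped without breaking any consumer.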


API-Driven Development (ADD)

About

ADD is a fast, flexible, robust, reusable, testable, scalable architecture design for (new and existing) applications.

Unlike traditional MVC (Model View Controller) frameworks, there is no model (i.e. data/representation) tightly coupling the view (e.g. webpage, app) to the controller (i.e. services, business logic). Instead, with the proliferation of technologies/platforms (e.g. browser, mobile, TVs), you will have many interfaces dealing with a single back-end.

Shifting business logic and data to web services follows the Semantic Web movement, which many believe to be the natural evolution of the internet.

Note: API refers to best practice RESTful web services, not SOAP!

Advantages of APIs

  • Decoupled (separation of business logic and presentation)
  • Stateless (less complex, scalable)
  • Simple (view/test in browser)
  • Reusable (one back-end API consumable by web, phone, ps3, etc.)
  • Exposed (open, mashups/marketing, scrutiny/improvement)
  • Facade (complexity hidden)
  • Universal/Agnostic (communication between systems using potentially different technologies)
  • Secure (HTTPS, OAuth)
  • Versioned
  • Cacheable (browser or server)
  • Defined and documented (API Blueprint, RAML or Swagger)

Disadvantages

  • Overhead (HTTP requests)
  • Vulnerable (exposed services, mitigated by proper design and security)

ADD Architecture & Advantages

  • Easy to mock
    • API consumers (e.g. HTML page) can interact with a mock API response, such as a URL to a static JSON file on Dropbox or Mockjax plugin for jQuery
  • Easy to test
    • Isolate and test just the business logic by calling an endpoint and validating the JSON returned
  • Technology agnostic
    • Underlying API and front-end technologies do not matter — use the best tool for the job
  • Modular
    • Break services into self-contained modules for greater robustness (smaller systems are less fragile) and selective scalability
  • Firewalled
    • All requests can be routed through a ‘Gatekeeper’ module that handles authorisation and authentication
    • Calls can be monitored and rate-limited to mitigate abuse
    • Can be added to the stack later with very little integration effort
  • Daisy chain
    • Provided endpoints are persistent and versioned, it’s possible to chain API calls together (i.e. functional programming)
  • Work to your strengths
    • Back-end people work exclusively with data, persistence, scalability and business logic. Front-end people work exclusively with user experience, visuals, accessibility and JSON.
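
The “easy to mock” and “easy to test” points can be sketched as follows, in Python rather than the jQuery/Mockjax setup mentioned above. The JSON contents and the function names are made up for illustration; the point is only that the front-end logic depends on a fetch function, so a canned response can stand in for the real API.

```python
import json
from typing import Callable

# Static JSON standing in for the real API response (contents are made up).
MOCK_USERS_JSON = (
    '[{"id": "jsmith", "name": "John Smith"}, {"id": "adoe", "name": "Anne Doe"}]'
)


def mock_fetch(url: str) -> str:
    """Ignore the URL and return the canned JSON, like a static file would."""
    return MOCK_USERS_JSON


def user_names(fetch: Callable[[str], str], base_url: str) -> list:
    """Front-end logic: call the users endpoint and pull out display names."""
    users = json.loads(fetch(base_url + "/v1/users"))
    return [user["name"] for user in users]


print(user_names(mock_fetch, "https://api.example.com"))
```

Swapping `mock_fetch` for a function that performs a real HTTP GET is the only change needed once the back-end exists.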

Approaches

The front-end (e.g. HTML, iPhone app) can be developed completely independently of the back-end (APIs) business logic. Approaches:

  1. Meet in the Middle: Both sides agree on the initial API specification and build them independently — with the front-end initially mocking the requests/responses. Requires flexibility from both sides via constant adjustments.
  2. Ground Up: Front-end developed running on mocked data, which is revised as needed. The final mock data is then filled-in by real APIs. Does not take into consideration requirements from other back-end systems.
  3. Top Down: API specification is created that (hopefully) exposes everything the front-end will need. Requirements may still be discovered late at the front-end.

All options are good, but #1 is recommended because it promotes RAD (Rapid Application Development) through low starting barriers, constant iteration and working in parallel.

The initial API specification document requires planning of the fundamental model and behaviours — and will be constantly revised during development. Don’t try and anticipate and build future endpoints now.

Scatter Loading


Recommended approach for new web applications, referred to as AJAX Scatter Loading:

  1. Back-end services as RESTful APIs returning JSON
    1. Modular and stateless for maximum scalability
  2. Authentication via a ‘Gatekeeper’ server using OAuth
    1. Proxy for all requests
    2. Returns a session token for future OAuth requests
    3. Stateless for scalability
  3. Front-end landing page
    1. Lightweight, generic, unsecured, uncached (to allow CSS/JavaScript updates) skeleton HTML
    2. Server calls API to get OAuth session token, used to sign protected API calls
      1. Store the token on page (i.e. JavaScript variable) or browser (e.g. cookie) to ensure front-end servers are stateless
      2. Tokens expire and may be associated to an IP
    3. Page populated by a series of AJAX calls to APIs
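
As a rough illustration of how a Gatekeeper could hand out expiring, IP-bound tokens without keeping any session state, here is a sketch using an HMAC-signed token. This is a simplified stand-in, not OAuth itself: the `SECRET`, the token layout and both function names are assumptions for the example.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"gatekeeper-demo-secret"  # assumption: known only to Gatekeeper nodes


def _sign(payload: bytes) -> str:
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()


def issue_token(user_id: str, ip: str, ttl_seconds: int,
                now: Optional[float] = None) -> str:
    """Pack user, IP and expiry into a signed token; no server-side session."""
    issued_at = time.time() if now is None else now
    payload = json.dumps(
        {"user": user_id, "ip": ip, "exp": issued_at + ttl_seconds}
    ).encode()
    return base64.urlsafe_b64encode(payload).decode() + "." + _sign(payload)


def verify_token(token: str, ip: str, now: Optional[float] = None) -> Optional[str]:
    """Return the user id if the token is authentic, unexpired and from the same IP."""
    current = time.time() if now is None else now
    try:
        encoded, signature = token.rsplit(".", 1)
        payload = base64.urlsafe_b64decode(encoded.encode())
    except ValueError:  # malformed token or invalid base64
        return None
    if not hmac.compare_digest(signature.encode(), _sign(payload).encode()):
        return None
    claims = json.loads(payload)
    if claims["ip"] != ip or claims["exp"] < current:
        return None
    return claims["user"]
```

Because everything needed to validate the token travels inside it, any Gatekeeper node can verify a request, which is what keeps both the Gatekeeper and the front-end servers stateless.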

Alternatives?

Each blue area is a pagelet streamed to the page

  • Facebook BigPipe
    • Endpoints return JSON containing HTML/Javascript/CSS, called pagelets
    • Use of chunked encoding and flush-transfer to make a single HTTP request (skeleton) with pagelets streamed and rendered in sequence thereafter
    • Reduced Facebook latency by half
    • Extra complexity
    • Does not support templating or caching
    • Harder to debug
    • Violates decoupling by putting HTML in JSON
    • Minor performance gains for normal web applications (possibly worse performance if lots of caching possible)
  • WebSockets may replace REST in the future due to overhead/performance gains
    • Current lack of browser support and replacement standard

BigPipe is Scatter Loading geared for enormous scale and maximum performance. It is overkill for normal businesses, but it was the inspiration for my interest in ADD…


RESTful Web Services, Part 3: The Perfect Framework

Introduction

Before I started investigating REST frameworks I went through the mental exercise of conceptualising the perfect REST framework.

It soon became clear that, to behave seamlessly, it relies heavily on other frameworks: authorisation, authentication, persistence and validation. With all the other pieces in play, it would be possible to transparently expose domain objects (i.e. database objects) and their CRUD (Create, Read, Update, Delete) operations as RESTful web services.

First, let’s define what the framework should and should not be responsible for!

Not Responsible

  • Transparent persistence layer, where all CRUD operations are automagically available for domain objects
  • Validation handled by the persistence layer (e.g. typecasting, field not null)
  • Authentication (e.g. OAuth)
    • Security (e.g. IP restrictions, mandatory HTTPS)
  • Authorisation
    • Data authorisation should occur as close to the data layer without touching the database — the persistence layer
      • Enforces consistent security to all services further up the stack (e.g. web application, API, messaging queue)
      • Filtered database queries for efficiency (e.g. list all WHERE id = X)
      • No need to ‘rebuild the wheel’ at every layer
      • A call to the DAO layer from a user without permission would result in an AccessDeniedError
    • Permissions (defined in database)
      • Access to resource X (e.g. access to Users)
      • Access to resource operation X (e.g. access to create Users)
      • Access to resource data X (e.g. access to the User password field)
      • Access from origin Y (e.g. access from API only)
  • Advanced searching capabilities (out of scope, provide overriding implementation)

Responsible

  • All domain CRUD operations would be automatically exposed as RESTful resources
  • Automatic serialisation of representations (JSON, XML, RSS, Atom, YAML) based on extension (/users.xml returns XML, default JSON)
  • Automatic deserialisation of PUT and POST data to objects (e.g. create a dog object from XML data)
  • Hypermedia links (HATEOAS) returned with representations
  • Automatic pagination of “read” lists based on query parameters (e.g. page=3&limit=10)
  • Automatic return field filtering based on query parameters (e.g. fields=foo,bar)
  • All resources v1 unless otherwise noted
  • Basic search of a domain’s properties (e.g. name, title)
  • Nested domain objects result in associations (e.g. User object with Dog object = /users/123/dogs)
  • Proper status codes and detailed error messages
  • Ability to manually expose other methods as web services if required (e.g. retrofitting legacy code, custom service methods such as “generateLaunchUrl”)
    • Project automatically scanned for markup, no configuration
    • Methods: @Get, @Post, @Put, @Delete
    • Mapping: @Path("/foo/*/{id}")
    • Parameters: Should be automatically mapped by name (e.g. {userId} maps to long userId)
    • Versioning: @Version(from="2", to="4")
    • May require custom business logic for authorisation
  • Maybe: Automatically generated web interfaces
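
The pagination and field-filtering behaviour from the list above is simple enough to sketch. The function names and the sample data are assumptions; only the `page`, `limit` and `fields` query parameters come from the post.

```python
def paginate(items: list, page: int = 1, limit: int = 10) -> list:
    """Return one page of results; pages are numbered from 1."""
    start = (page - 1) * limit
    return items[start:start + limit]


def filter_fields(item: dict, fields: str = "") -> dict:
    """Keep only the comma-separated fields requested, or everything if none given."""
    wanted = [name for name in fields.split(",") if name]
    if not wanted:
        return dict(item)
    return {name: item[name] for name in wanted if name in item}


users = [
    {"id": n, "name": f"user{n}", "email": f"user{n}@example.com"}
    for n in range(1, 21)
]

second_page = paginate(users, page=2, limit=5)
print([user["id"] for user in second_page])   # ids 6 to 10
print(filter_fields(users[0], "email"))       # only the email field
```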

Workflow

  1. Create domain objects (e.g. Users and Dogs)
    1. Database schema created
    2. CRUD operations created
  2. Markup objects with validation (e.g. names are mandatory and less than 100 characters)
  3. Define any authentication (e.g. OAuth consumer.id based on user.id)
  4. Define any authorisation rules (e.g. only access objects where consumer.id = user.id, certain users read only, if user.role = ‘admin’ can )
  5. A custom service layer method, getLink, is marked up for a GET request for the path /link
  6. At this point the RESTful web service is ready to use!

| Example URL | Description |
| --- | --- |
| https://api.example.com/v1/users.xml | As an admin, returns XML of all users with hypermedia. |
| https://api.example.com/v1/users?page=2&limit=5 | As an admin, returns JSON of users 6-10 with hypermedia. |
| https://api.example.com/v1/users/123 | Returns a single user based on authorisation. |
| https://api.example.com/v1/users/123?fields=email | Returns the email of a single user. |
| https://api.example.com/v1/users/123/dogs | Returns that user’s dogs. |
| https://api.example.com/v1/link | Returns a generated URL. |
| https://api.example.com/v1/users (POST) | If permitted, the sent user data is validated against the domain object markup. If not permitted, a 403 is returned. |
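
Step 5 of the workflow (marking up a custom method for a path) could look something like this. It is a Python decorator sketch of the @Get/@Path idea, not a real framework: `route`, `get`, `dispatch` and the handlers are hypothetical, and path parameters are mapped to arguments by name as described in the Responsible list.

```python
import re

ROUTES = []  # (HTTP method, compiled path pattern, handler)


def route(method: str, path: str):
    """Register a handler; "{name}" in the path becomes a named argument."""
    # Assumes paths contain no regex metacharacters other than the {name} slots.
    pattern = re.compile("^" + re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", path) + "$")

    def register(handler):
        ROUTES.append((method, pattern, handler))
        return handler

    return register


def get(path: str):
    return route("GET", path)


@get("/v1/link")
def get_link():
    return 200, "https://example.com/launch"  # made-up generated URL


@get("/v1/users/{userId}/dogs")
def list_dogs(userId):
    return 200, f"dogs of user {userId}"


def dispatch(method: str, path: str):
    """Find the first matching route and call it with the path parameters."""
    for registered_method, pattern, handler in ROUTES:
        match = pattern.match(path)
        if registered_method == method and match:
            return handler(**match.groupdict())
    return 404, "Not Found"


print(dispatch("GET", "/v1/users/123/dogs"))
```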

Conclusion

The perfect system would reduce projects to a network of decoupled API calls. Mindless tasks such as CRUD, serialisation and hypermedia would be taken care of.

I would approach a hypothetical Java implementation in the following way:

  • No RAD (Rapid Application Development) tools such as Spring Roo (hard to trace, test, modify, easy to break)
  • Domain POJOs annotated with JSR-303 annotations (room for improvement, another post…)
  • Domain POJOs annotated with JPA annotations (room for improvement, another post…)
  • Domain POJOs extend a Resource class which extends GenericDao, using generics to provide CRUD operations
  • Define authorisation rules, which are stored in the database (origin, target class, target method, returned data). Examples:
    • Users can only access their own data (e.g. user.id = currentUser.id and user.role = ‘user’)
    • Users cannot create other users, but managers can
    • Managers can access their own data plus those under them (e.g. parent = user.id and user.role = ‘manager’)
    • Administrators can access all data for a company (e.g. company.id = user.company.id and user.role = ‘admin’)
    • Really complex permissions can be handled in custom classes (akin to custom validation)
  • On project start, all Resource classes are scanned and CRUD operations exposed as RESTful services based on authorisation rules
  • Serialisation and parameter mapping done through the magic of reflection
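
The example authorisation rules above can be written out as plain predicates. This is a Python sketch rather than the Java the post has in mind, and the rules are hard-coded here instead of being stored in the database; the `User` fields and function names are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    id: int
    role: str                       # "user", "manager" or "admin"
    company_id: int
    parent_id: Optional[int] = None  # id of the user's manager, if any


def can_read(current: User, target: User) -> bool:
    """Apply the row-level rules from the list above."""
    if current.role == "admin":
        # Administrators can access all data for their company.
        return target.company_id == current.company_id
    if current.role == "manager":
        # Managers can access their own data plus those under them.
        return target.id == current.id or target.parent_id == current.id
    # Ordinary users can only access their own data.
    return target.id == current.id


def can_create_user(current: User) -> bool:
    """Users cannot create other users, but managers (and admins) can."""
    return current.role in ("manager", "admin")


admin = User(1, "admin", company_id=10)
manager = User(2, "manager", company_id=10)
staff = User(3, "user", company_id=10, parent_id=2)
print(can_read(manager, staff), can_read(staff, manager))
```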

Obviously some other languages would be better suited and less rigid than Java…

RESTful Web Services, Part 2: Specification

Naming Conventions

Verbs are bad, nouns are good, plurals are best.

| Resource | POST | GET | PUT | DELETE |
| --- | --- | --- | --- | --- |
| /dogs | Create new dog | List dogs | Bulk update dogs | Delete all dogs |
| /dogs/foo | ERROR | Show foo | Update foo if exists, otherwise ERROR | Delete foo |

  • Subdomain:
    • api.example.com
  • Pagination:
    • /dogs?limit=20&page=10
  • Versioning (backwards incompatible changes only):
    • /v1/dogs
    • /v2/dogs
  • Complex queries:
    • /dogs?colour=red&size=small
  • Global search:
    • /search?q=foo+bar
  • Scoped search:
    • /owners/123/dogs/search?q=foo+bar
  • Return fields:
    • /dogs?fields=name,colour,location
  • Return format:
    • JSON (Default): /dogs
    • XML: /dogs.xml
    • YAML: /dogs.yaml
  • Associations:
    • /owners/dave/dogs (returns all dogs owned by Dave)
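
To show how these conventions fit together, here is a sketch that takes a URL apart into version, resource, format and query modifiers. `parse_request` and its default page size of 20 are assumptions for the example, not part of the specification.

```python
import re
from urllib.parse import parse_qs, urlsplit

FORMATS = ("xml", "yaml")  # JSON is the default when no extension is given


def parse_request(url: str) -> dict:
    """Split a URL into version, resource path, format and query modifiers."""
    parts = urlsplit(url)
    path, fmt = parts.path, "json"
    for extension in FORMATS:
        if path.endswith("." + extension):
            path, fmt = path[: -len(extension) - 1], extension
    segments = [segment for segment in path.split("/") if segment]
    version = None
    if segments and re.fullmatch(r"v\d+", segments[0]):
        version = segments.pop(0)
    # parse_qs returns lists of values; keep the first value of each parameter.
    query = {key: values[0] for key, values in parse_qs(parts.query).items()}
    fields = query.pop("fields", "")
    return {
        "version": version,
        "resource": "/" + "/".join(segments),
        "format": fmt,
        "fields": fields.split(",") if fields else [],
        "page": int(query.pop("page", 1)),
        "limit": int(query.pop("limit", 20)),  # default limit is an assumption
        "filters": query,  # whatever is left, e.g. colour=red&size=small
    }


print(parse_request("https://api.example.com/v1/dogs.xml?fields=name,colour&page=10"))
```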

Security

Mix and match security according to your needs. An IP restricted, OAuth authenticated resource call over HTTPS will provide a very high level of security.

  • Transport: HTTPS encryption (optional or mandatory)
  • Authentication: 2-Legged OAuth 1.0a (widely adopted open standard)
  • Authentication: IP restriction
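
For reference, this is roughly how an OAuth 1.0a HMAC-SHA1 signature is computed for a 2-legged request (no token secret). It is a sketch of the signing step only; the function names and the sample values are made up, and a real client would also generate the nonce and timestamp and send the result in the Authorization header.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value: str) -> str:
    """OAuth percent-encoding: everything except unreserved characters."""
    return quote(value, safe="")


def sign_request(method: str, base_url: str, params: dict,
                 consumer_secret: str, token_secret: str = "") -> str:
    """Return the base64 HMAC-SHA1 signature over the signature base string.

    `base_url` must not include the query string; query parameters and the
    oauth_* parameters all go in `params`.
    """
    pairs = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    normalised = "&".join(f"{k}={v}" for k, v in pairs)
    base_string = "&".join(
        [method.upper(), percent_encode(base_url), percent_encode(normalised)]
    )
    # 2-legged OAuth has no token, so the token secret part of the key is empty.
    key = percent_encode(consumer_secret) + "&" + percent_encode(token_secret)
    digest = hmac.new(key.encode(), base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


example_params = {
    "oauth_consumer_key": "my-consumer-key",    # made-up values
    "oauth_nonce": "abc123",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_timestamp": "1300000000",
    "oauth_version": "1.0",
    "page": "2",
}
signature = sign_request("GET", "https://api.example.com/v1/users",
                         example_params, "my-consumer-secret")
print(signature)
```

The provider repeats the same calculation with its copy of the consumer secret; a mismatch is what produces the 401 described in the status code section.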

Encoding

  • Path variables cannot be URL encoded and therefore should contain only alphanumeric characters
  • Query parameters to be URL encoded
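
A quick sketch of both rules using Python’s standard library. The helper name `is_valid_path_variable` is an assumption; `urlencode` is the standard way to build an encoded query string.

```python
import re
from urllib.parse import urlencode


def is_valid_path_variable(value: str) -> bool:
    """Path variables are restricted to alphanumeric characters."""
    return re.fullmatch(r"[A-Za-z0-9]+", value) is not None


# Query parameters are URL encoded; spaces become "+" and "&" becomes "%26".
query = urlencode({"q": "foo bar", "colour": "red & white"})
print("/dogs?" + query)

print(is_valid_path_variable("jsmith"))      # alphanumeric only
print(is_valid_path_variable("john smith"))  # contains a space
```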

HTTP Status Codes

Leverage the HTTP protocol by returning relevant status codes that can be reliably interpreted.

| Code | Description | Error Message | Example |
| --- | --- | --- | --- |
| 200 OK | Request was successful. | No | GET https://www.example.com/v1/users/jsmith returns the jsmith user object. |
| 401 Unauthorized | Request could not be authenticated. The consumer should take steps to verify they have correct public/private keys and the request is being hashed correctly. | No | GET https://www.example.com/v1/users/jsmith is accessed directly through the browser (and not an OAuth client). |
| 403 Forbidden | The server understood the request (no problem with authentication), but is refusing to fulfil it. The request can either be modified or re-attempted once the underlying data has been changed (i.e. adding a user will work after the existing user has been removed). A returned 403 HTTP Status Code will contain a serialised error message containing a “type” (e.g. ResourceNotFoundException) and a human-readable “message” element. | Yes | POST https://www.example.com/v1/users/jsmith without the required parameters (i.e. username) or the username already exists. |
| 404 Not Found | An error occurred because the service could not be found. The request should not be repeated. | No | GET https://www.example.com/v1/user does not exist (missing ‘s’ at the end). |
| 500 Internal Server Error | The request was valid but an error occurred with the provider. The request can be re-attempted once the provider addresses the issue. | Yes | GET https://www.example.com/users/jsmith cannot be processed because the database is unavailable! |
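
A sketch of how the “Error Message” column might translate into a response builder: only 403 and 500 carry a serialised body with a “type” and “message”. The function name and the JSON layout beyond those two elements are assumptions.

```python
import json

# Per the table above, only these status codes return a serialised error message.
STATUS_WITH_BODY = {403, 500}


def error_response(status: int, error_type: str, message: str):
    """Return (status, body); body is JSON for 403/500 and None otherwise."""
    body = None
    if status in STATUS_WITH_BODY:
        body = json.dumps({"type": error_type, "message": message})
    return status, body


print(error_response(403, "ValidationException", "username is required"))
print(error_response(404, "ResourceNotFoundException", "no such service"))
```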

Documentation

Clear and comprehensive documentation is vital for any successful web service.

  • List endpoint variables with colons (e.g. /dogs/:name)
  • Security details
  • List all conventions, resources, status codes, and optional query modifiers
  • Up-to-date
  • Publicly available
  • Example requests and responses
  • XML Schema Definition (XSD) for XML responses
  • JSON Schema for JSON (not widely supported)

RESTful Web Services, Part 1: Introduction

About

Web services provide a common ground for web applications to communicate. Exposing your web applications encourages use (e.g. FourSquare) and fosters creativity (e.g. Google Maps mash-ups). The W3C recognises two types of web services: REST-compliant and arbitrary (e.g. SOAP).

REST

REpresentational State Transfer (REST) is a style of software architecture (not implementation) that defines behaviour between a client and server — most famously adopted by (but not limited to) the web.

Representations (i.e. actual information) are transferred by resources (i.e. sources of information, endpoints) identified by global identifiers (e.g. web address) over a standardised interface (e.g. HTTP), which passes through network components and connectors (e.g. servers). Key goals of REST are performance, scalability, simplicity, modifiability, visibility, portability, and reliability. To be considered RESTful the following architecture constraints must be met:

  1. Decoupled client and server
  2. Stateless
  3. Cacheable (implicitly or explicitly specified by the response)
  4. Layered (client unaware of intermediary servers e.g. load balancer, firewall, proxy)
  5. Optional code on demand (extend client functionality with executable code e.g. JavaScript to a browser)
  6. Generic interface:
    • Resources are identified on request and return the corresponding representation
    • Existing protocol features are adopted (e.g. HTTP response codes)
    • Responses contain the information on how they are processed (e.g. MIME types, cacheability)
    • Responses contain the links to modify the resource on the server (e.g. URI to delete), provided it has permission to do so
    • Responses contain the links for further states (i.e. other resources). A client cannot assume any resources outside the initial entry point.

REST principles applied to web services implemented over HTTP must:

  1. Contain a base URI (e.g. http://api.example.com/v1/resources/)
  2. Support MIME types (e.g. JSON, XML, YAML)
  3. Support operations using HTTP methods (i.e. PUT, POST, GET, DELETE)
  4. Hypertext driven (i.e. linking to other resources), also referred to as HATEOAS (Hypermedia As The Engine Of Application State).

HATEOAS (i.e. returned hyperlinks) pros and cons:

  • Flexibility to change resource URLs +1
  • Extensible (i.e. future proof) +1
  • No documentation required +1
  • If documentation is provided, it is subject to change without notice -1
  • More responsibility and complexity for API consumers to discover URLs, which makes it slower and harder to adopt, especially if OAuth is involved. Also does not respect Tesler’s Law -1
  • Less efficient because clients must make the same calls to discover the same URLs (although caching helps) -1
  • In reality, the web is RESTful but people often bookmark pages to avoid re-discovering URLs -1

Many web services claim to be RESTful, but do not implement all the REST constraints (almost always ignoring HATEOAS). Such web services can be considered RESTish and I’m going to go ‘out on a limb’ and estimate that 95% of all claimed services are not RESTful. This makes REST creator, Roy Fielding, want to yell at you.

Examples:

{
    "home": {
        "users": {
            "a": {
                "href": "https://api.example.com/v1/users"
            }
        },
        "dogs": {
            "a": {
                "href": "https://api.example.com/v1/dogs"
            }
        }
    }
}

https://api.example.com/v1/users

{
    "users": {
        "id": "jsmith",
        "name": "John Smith",
        "dogs": {
            "dog": {
                "id": "fido",
                "name": "Fido",
                "a": {
                    "href": "https://api.example.com/v1/users/jsmith/dogs/fido"
                }
            }
        }
    }
}

https://api.example.com/v1/users/jsmith/dogs/fido

{
    "dog": {
        "id": "fido",
        "name": "Fido"
    }
}
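
To illustrate what “hypertext driven” means for a client, here is a sketch that starts at the entry point and reaches Fido purely by following returned links. The responses are held in a dictionary instead of being fetched over HTTP, and the entry point URL, `fetch` and `find_links` are assumptions for the example.

```python
ENTRY_POINT = "https://api.example.com/v1"  # assumed entry point URL

# Canned responses mirroring the three example documents above.
RESPONSES = {
    ENTRY_POINT: {
        "home": {
            "users": {"a": {"href": "https://api.example.com/v1/users"}},
            "dogs": {"a": {"href": "https://api.example.com/v1/dogs"}},
        }
    },
    "https://api.example.com/v1/users": {
        "users": {
            "id": "jsmith",
            "name": "John Smith",
            "dogs": {
                "dog": {
                    "id": "fido",
                    "name": "Fido",
                    "a": {"href": "https://api.example.com/v1/users/jsmith/dogs/fido"},
                }
            },
        }
    },
    "https://api.example.com/v1/users/jsmith/dogs/fido": {
        "dog": {"id": "fido", "name": "Fido"}
    },
}


def fetch(url: str) -> dict:
    """Stand-in for an HTTP GET returning parsed JSON."""
    return RESPONSES[url]


def find_links(node: dict, name: str = "") -> dict:
    """Collect {relation name: href} for every nested "a" link in a document."""
    links = {}
    for key, value in node.items():
        if key == "a" and isinstance(value, dict) and "href" in value:
            links[name] = value["href"]
        elif isinstance(value, dict):
            links.update(find_links(value, key))
    return links


# The client knows only the entry point; every other URL is discovered.
home_links = find_links(fetch(ENTRY_POINT))
user_links = find_links(fetch(home_links["users"]))
print(fetch(user_links["dog"]))
```

Note the cost mentioned in the cons list: reaching Fido takes three requests, where a client with a documented (or bookmarked) URL would need one.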

SOAP

Simple Object Access Protocol (SOAP) is an XML-based protocol that reduces HTTP (and other protocols) to the role of simple transport mechanism.

The protocol consists of an envelope (what the message is and how to process it), encoding rules (datatypes), and a convention for representing procedure calls and responses. SOAP consists of several layers (e.g. message exchange patterns) and was developed by Microsoft as a successor to XML-RPC.

The key goals of SOAP are extensibility (i.e. extensions), neutrality (any transport protocol), and independence from any programming model.

Examples:
Request
POST /Users HTTP/1.1
Host: api.example.com
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn
SOAPAction: "Get-Users"

<SOAP-ENV:Envelope
  xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
  SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
   <SOAP-ENV:Body>
       <m:GetUsers xmlns:m="Get-Users">
           <id>jsmith</id>
       </m:GetUsers>
   </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
Response
HTTP/1.1 200 OK
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn

<SOAP-ENV:Envelope
  xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
  SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
   <SOAP-ENV:Body>
       <m:GetUsersResponse xmlns:m="Get-Users">
         <users>
            <id>jsmith</id>
            <name>John Smith</name>
            <dogs>
               <dog>
                  <id>fido</id>
                  <name>Fido</name>
               </dog>
            </dogs>
         </users>
       </m:GetUsersResponse>
   </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
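
For comparison with the REST examples, here is a sketch of building the request envelope and reading the response with Python’s standard `xml.etree.ElementTree`. The function names are made up, and the `Get-Users` namespace is simply the one used in the example messages above; a real SOAP client would normally be generated from a WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SERVICE_NS = "Get-Users"  # namespace taken from the example messages

RESPONSE = f"""
<SOAP-ENV:Envelope xmlns:SOAP-ENV="{SOAP_NS}">
   <SOAP-ENV:Body>
       <m:GetUsersResponse xmlns:m="{SERVICE_NS}">
         <users><id>jsmith</id><name>John Smith</name></users>
       </m:GetUsersResponse>
   </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
"""


def build_request(user_id: str) -> bytes:
    """Build the GetUsers request envelope for one user id."""
    ET.register_namespace("SOAP-ENV", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}GetUsers")
    ET.SubElement(call, "id").text = user_id
    return ET.tostring(envelope)


def parse_user_names(xml_text: str) -> list:
    """Pull the <name> of every <users> element out of a GetUsersResponse."""
    namespaces = {"soap": SOAP_NS, "m": SERVICE_NS}
    root = ET.fromstring(xml_text)
    response = root.find("soap:Body/m:GetUsersResponse", namespaces)
    return [user.findtext("name") for user in response.findall("users")]


print(build_request("jsmith").decode())
print(parse_user_names(RESPONSE))
```

Even in this trimmed form the envelope and namespace handling is noticeably heavier than the equivalent GET plus JSON parse.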

REST vs SOAP

REST Pros

  • Aligned with HTTP specification, using it as an application level protocol (as intended)
  • Not as strict (depending on how you look at it)
  • Lightweight
  • Easily called from client-side (i.e. JavaScript)
  • Cacheable (GET operations)
  • Less verbose
  • Human readable
  • Easier to build (i.e. frameworks or roll your own) and test (e.g. directly in browser)
  • Can return multiple formats
  • No need to define a new arbitrary vocabulary (e.g. PUT = insert)
  • Widely supported (e.g. Amazon, Twitter, Facebook) — although many are only RESTish (not fully compliant)

REST Cons

  • No asynchronous processing or stateful support (reliability, security, transactions)
    • Alternative: Tokens (security only)
  • No support for exposing business logic
  • Only available over HTTP (point-to-point, performance, restricted verbs). SOAP supports SMTP and JMS.
  • No formal contract (i.e. strongly typed WSDL) because it is an architecture rather than a protocol
    • Alternative: WADL (although not widely supported)
  • No default end-to-end security like SOAP WS-Security
    • Alternative: OAuth
  • No binary encoding (i.e. attachments)
    • Alternative: MIME octet-stream base64 encoding

Conclusion

HTTP OK? AJAX support OK? Stateless OK? Simplicity OK? Go with RESTful or RESTish.

In my commercial experience endpoints rarely change — it’s more likely to be the query parameters (e.g. /books?status=read) or representations they return (e.g. adding a new field). There is also no value having new resources appear in responses as they will still need to be discovered (via documentation), understood (via documentation) and implemented by developers. I’m a fan of making things as simple as possible, which is why I personally prefer RESTful for direct human interaction (e.g. the web) and RESTish for B2B integration (e.g. web services).
