- May 18, 2015

In a previous post, we described best practices for API design (that is, for RESTful services). It covered the foundations of implementing REST principles.

That post mainly focused on implementing CRUD operations and didn’t tackle bulk updates. Bulk updates matter because every interaction with the server is a network round trip, and handling several elements in one request keeps the number of round trips down. Today, we will look at bulk updates and at how to design them independently of any server-side implementation.

Throughout the post, we use the contacts sample and extend it to support bulk operations.

Implementing bulk adds

In our previous post, we explored in detail what API design is and how to add a single contact within a Web API. Here we describe how to extend this feature to support adding a set of contacts. An API testing tool is recommended for sending requests and inspecting the responses of your API.

Request

Commonly, a collection resource already uses the POST method to add a single element to the collection. That’s why we need a mechanism that supports several insertions through the same POST method. Introducing another resource with an action name in its path, such as /contacts/bulk, isn’t really RESTful, so it’s not the right approach.

Two approaches can be considered to support several actions and contents on the same POST method:

  • Content based. The collection resource accepts both a single element and a collection of elements for its POST method. Based on the input payload, the processing detects whether a single add or a bulk add must be done.
  • Action identifier based. An identifier of the action to handle is provided within the request using, for example, a custom header.

For the first approach, we can have the following request in the case of a single element:

POST /contacts
Content-Type: application/json
{
    "firstName": "my first name",
    "lastName": "my last name",
    (...)
}

And the following, in the case of a bulk add:

POST /contacts
Content-Type: application/json
[
    {
        "firstName": "my first name (1)",
        "lastName": "my last name (1)"
        (...)
    },
    {
        "firstName": "my first name (2)",
        "lastName": "my last name (2)"
        (...)
    },
    (...)
]
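
The content-based detection can be sketched in a few lines. This is only an illustration in Python, not tied to any framework: the function name classify_add_payload is hypothetical, and we assume the raw request body is available as a string.

```python
import json


def classify_add_payload(body: str) -> str:
    """Return "single" or "bulk" depending on the shape of the JSON body."""
    payload = json.loads(body)
    if isinstance(payload, list):
        # A bulk add must be an array of JSON objects.
        if not all(isinstance(item, dict) for item in payload):
            raise ValueError("bulk payload must be an array of objects")
        return "bulk"
    if isinstance(payload, dict):
        return "single"
    raise ValueError("payload must be a JSON object or an array of objects")
```

The same POST handler can then call this function first and route the request to either the single-add or the bulk-add processing.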

For the second approach, we can have the following request in the case of a single element:

POST /contacts
X-Action: single
Content-Type: application/json
{
    "firstName": "my first name",
    "lastName": "my last name",
    (...)
}

And the following, in the case of a bulk add:

POST /contacts
X-Action: bulk
Content-Type: application/json
[
    {
        "firstName": "my first name (1)",
        "lastName": "my last name (1)"
        (...)
    },
    {
        "firstName": "my first name (2)",
        "lastName": "my last name (2)"
        (...)
    },
    (...)
]
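
With the header-based approach, the server reads the action from the header and can still check that the payload has the matching shape. Here is a minimal sketch in Python, assuming the headers arrive as a plain dictionary; the function name resolve_action and the choice to default to a single add when the header is absent are our assumptions.

```python
import json


def resolve_action(headers: dict, body: str) -> str:
    """Read the X-Action header and check it against the payload shape."""
    # HTTP header names are case-insensitive, so normalise them before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    # Assumption: a missing header means a single add.
    action = normalized.get("x-action", "single").strip().lower()
    if action not in ("single", "bulk"):
        raise ValueError(f"unsupported X-Action value: {action!r}")
    payload = json.loads(body)
    expected_type = list if action == "bulk" else dict
    if not isinstance(payload, expected_type):
        raise ValueError(f"payload does not match X-Action: {action}")
    return action
```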

Response

With a single add, the response is quite straightforward and commonly contains two things:

  • A status code 201 (Created)
  • A Location header containing the URL of the newly created element

The following snippet describes the content of such a response:

201 Created
Location: http://(...)/elements/generated-id

In the context of a bulk add, things need to be adapted a bit: the Location header accepts a single value and can be defined only once within a response.

That said, since the semantics of the POST method are up to the RESTful service designer, we can leverage the Link header to provide this hint, as described below:

201 Created
Link: <http://(...)/elements/generated-id1>, <http://(...)/elements/generated-id2>
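
Building such a header value from the list of created URLs is straightforward. A small sketch in Python follows; the function name build_link_header and the api.example.com host are placeholders of ours.

```python
def build_link_header(urls):
    """Join the URLs of the created elements into one Link header value."""
    return ", ".join(f"<{url}>" for url in urls)


created = [
    "http://api.example.com/elements/generated-id1",  # placeholder host
    "http://api.example.com/elements/generated-id2",
]
link_value = build_link_header(created)
```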

Note that status code 202 (Accepted) is particularly applicable here, since a bulk add can be handled asynchronously. In that case, we need to poll a dedicated resource to learn the status of the processing.

Returning a single status code for the whole request only works if we consider the processing to be transactional:
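
On the client side, polling could look like the sketch below. Everything in it is an assumption of ours: the post does not define the status resource, so fetch_status stands for any callable that reads it, and the state names "pending", "done" and "failed" are hypothetical.

```python
import time


def wait_for_bulk_job(fetch_status, max_attempts=10, delay_seconds=1.0):
    """Poll fetch_status() until a terminal state is reported or attempts run out."""
    for _ in range(max_attempts):
        status = fetch_status()
        if status in ("done", "failed"):  # hypothetical terminal states
            return status
        time.sleep(delay_seconds)
    return "timeout"
```

In a real client, fetch_status would issue a GET on the status resource returned with the 202 response.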

  • Everything works fine and all data is inserted,
  • At least one element has validation errors and nothing is added,
  • One or more inserts fail and everything is rolled back.

In this case, if there are some validation errors, the response could be as described below:

422 Unprocessable Entity
Content-type: application/json
[
    {
        "index": 1,
        "messages": [
            {
                "firstName": "The first name should have at least three characters."
            }
        ]
    },
    {
        "index": 2,
        "messages": [
            {
                "id": "The value of the field id isn't unique."
            }
        ]
    }
]

In the case of insertion errors:

500 Internal Server Error
Content-type: application/json
[
    {
        "index": 1,
        "messages": [
            "The contact can't be added because of the error #22 (description)"
        ]
    },
    (...)
]
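
The transactional behaviour can be sketched as follows, with an in-memory dictionary standing in for the store. The helpers validate_contact and insert_contact, the validation rule and the naive id generation are all hypothetical and only serve the illustration.

```python
def validate_contact(contact):
    """Return a list of {field: message} entries; empty when the contact is valid."""
    errors = []
    if len(contact.get("firstName", "")) < 3:
        errors.append({"firstName": "The first name should have at least three characters."})
    return errors


def insert_contact(store, contact):
    """Hypothetical insert: rejects an exact duplicate, otherwise returns a new id."""
    if contact in store.values():
        raise ValueError("The contact already exists.")
    new_id = str(len(store) + 1)  # naive id generation, good enough for a sketch
    store[new_id] = contact
    return new_id


def bulk_add_transactional(store, contacts):
    """Add all contacts or none. Returns (status_code, body)."""
    report = []
    for index, contact in enumerate(contacts, start=1):
        messages = validate_contact(contact)
        if messages:
            report.append({"index": index, "messages": messages})
    if report:
        return 422, report  # validation failed: nothing was written
    snapshot = dict(store)
    created = []
    for index, contact in enumerate(contacts, start=1):
        try:
            created.append(insert_contact(store, contact))
        except ValueError as exc:
            # Roll back every insert done so far.
            store.clear()
            store.update(snapshot)
            return 500, [{"index": index, "messages": [str(exc)]}]
    return 201, created
```

A real implementation would rely on the transaction support of its store rather than on a snapshot of a dictionary.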

For non-transactional processing, we need to return one result per element contained in the request payload. The status code of the response is always 200, and errors, if any, are described in the response payload, as shown below:

200 OK
Content-type: application/json
[
    {
        "index": 1,
        "status": "error",
        "messages": [
            "The contact can't be added because of the error #22 (description)"
        ]
    },
    {
        "index": 2,
        "status": "success",
        "auto-generated-id": "43"
    },
    (...)
]
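
A non-transactional variant processes each element on its own and collects one result per element. The sketch below again uses an in-memory dictionary; insert_contact and its rule (a lastName is required) are hypothetical.

```python
def insert_contact(store, contact):
    """Hypothetical insert: requires a lastName and returns the generated id."""
    if not contact.get("lastName"):
        raise ValueError("The contact can't be added: lastName is required.")
    new_id = str(len(store) + 1)  # naive id generation, good enough for a sketch
    store[new_id] = contact
    return new_id


def bulk_add_per_item(store, contacts):
    """Try every insert independently and report one result per element."""
    results = []
    for index, contact in enumerate(contacts, start=1):
        try:
            new_id = insert_contact(store, contact)
        except ValueError as exc:
            results.append({"index": index, "status": "error", "messages": [str(exc)]})
        else:
            results.append({"index": index, "status": "success", "auto-generated-id": new_id})
    return 200, results
```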

Another approach is to replace the whole representation of the collection resource.

Implementing bulk replace

The PUT method can also be used on a collection resource. In this case, it means that we want to completely replace the content of the collection associated with the resource with a new one. This can be defined with an API design tool.

Request

We can simply send the whole collection content, as described below:

PUT /contacts
Content-Type: application/json
[
    {
        "firstName": "my first name (1)",
        "lastName": "my last name (1)"
        (...)
    },
    {
        "firstName": "my first name (2)",
        "lastName": "my last name (2)"
        (...)
    },
    (...)
]

Response

This approach needs to be transactional: either the representation is replaced, or it isn’t.

If the request is successful, we can simply have the following response:

204 No Content

In the case of errors, we can have similar contents as with the bulk additions described earlier. For example:

422 Unprocessable Entity
Content-type: application/json
[
    {
        "index": 1,
        "messages": [
            {
                "firstName": "The first name should have at least three characters."
            }
        ]
    },
    {
        "index": 2,
        "messages": [
            {
                "id": "The value of the field id isn't unique."
            }
        ]
    }
]

Before ending this section, we need to address auto-generated identifiers. When it receives the content of a list, the Web API might need to insert elements into the store and auto-generate their identifiers. Here we reach the limit of the feature: PUT must be idempotent, and with auto-generated identifiers, sending the same request again won’t produce exactly the same list representation. For that reason, we should use another approach for such a use case.

HTTP also provides the PATCH method, which makes it possible to implement partial updates of the state of a resource.
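
The idempotency problem is easy to reproduce with a toy store. The ContactStore class below is purely illustrative: it replaces the collection on each call and generates an identifier for every element that does not carry one.

```python
import itertools


class ContactStore:
    """Toy in-memory store that auto-generates ids on collection replace."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.contacts = []

    def put_collection(self, items):
        """Replace the whole collection, generating an id for items that lack one."""
        self.contacts = [
            item if "id" in item else {**item, "id": str(next(self._ids))}
            for item in items
        ]
        return self.contacts


store = ContactStore()
payload = [{"firstName": "Ada"}, {"firstName": "Alan"}]
first = store.put_collection(payload)
second = store.put_collection(payload)
# The same request sent twice yields two different representations,
# because new identifiers were generated the second time.
same_representation = first == second
```

When the client supplies the identifiers itself, sending the same payload twice leaves the collection in the same state, and the operation stays idempotent.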

Implementing bulk updates

In this section, we will tackle the way to implement bulk updates based on the HTTP method PATCH. The latter targets partial updates of resource states and is particularly suitable for bulk updates. Let’s see how to handle this situation.

Request

Using the PATCH method is by far the most convenient way to partially update the collection associated with a resource. We don’t have to send the whole collection, only the elements we want to update. In fact, this approach lets us control which kind of update we want to make (add, update or delete). Although we are free to describe the operations to execute in any format we like, some standard formats are available.

JSON Patch can be used in this context. It is a JSON document structure for expressing a sequence of operations to apply to a JSON document. A similar XML format named XML Patch is also available.

We use the JSON Patch format below.

The content provided is an array of JSON objects that can have the following attributes:

  • Attribute op, which describes the operation to perform on the element. In our context, the values add, remove and replace are relevant (JSON Patch has no update operation; replace plays that role).
  • Attribute path, which identifies the element involved within the JSON document. The JSON Patch specification (RFC 6902) requires it for every operation; the example below takes the liberty of omitting it for an add operation.
  • Attribute value, which contains the content of the element to use for the operation.

The following code describes how to update the list of contacts by adding a new one and removing the one with identifier 1:

PATCH /contacts
[
    {
        "op": "add", "value": {
            "firstName": "my first name",
            "lastName": "my last name"
        }
    },
    {
        "op": "remove", "path": "/contacts/1"
    }
]
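
On the server side, processing such a payload amounts to looping over the operations. The sketch below follows the simplified format used in this post (an add without path, a remove whose path ends with the contact identifier), not strict JSON Patch. The function name apply_contact_patch and the next_id callable that supplies identifiers are hypothetical.

```python
def apply_contact_patch(contacts, operations, next_id):
    """Apply add/remove operations to a dict of contacts keyed by identifier."""
    updated = dict(contacts)  # work on a copy so the input is left untouched
    for operation in operations:
        op = operation["op"]
        if op == "add":
            updated[next_id()] = operation["value"]
        elif op == "remove":
            # The identifier is the last segment of a path like "/contacts/1".
            contact_id = operation["path"].rsplit("/", 1)[-1]
            if contact_id not in updated:
                raise KeyError(f"unknown contact: {contact_id}")
            del updated[contact_id]
        else:
            raise ValueError(f"unsupported operation: {op}")
    return updated
```

Because the function returns a new dictionary, the caller can decide whether to keep the result (everything succeeded) or discard it (an operation raised an error).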

Response

Regarding the response, we are in exactly the same scenario as for element additions. We can consider the bulk updates to be transactional or not. Note that the JSON Patch specification doesn’t advise anything regarding the response content. It’s up to you to use the most appropriate format.

In the case of a transactional approach, we have the following scenario:

  • Everything works fine and all data is inserted,
  • At least one element has validation errors and nothing is added,
  • One or more inserts fail and everything is rolled back.

In such a case, we can have response content as described below:

200 OK
Content-type: application/json
[
    {
        "index": 1,
        "status": "error",
        "messages": [
            "The contact can't be added because of the error #22 (description)"
        ]
    },
    {
        "index": 2,
        "status": "skipped"
    },
    (...)
]

In the case of a non-transactional approach, elements are added individually, even if some of them can’t be added because of errors. The response content would be:

200 OK
Content-type: application/json
[
    {
        "index": 1,
        "status": "error",
        "messages": [
            "The contact can't be added because of the error #22 (description)"
        ]
    },
    {
        "index": 2,
        "status": "success",
        "auto-generated-id": "43"
    },
    (...)
]

Summary

In this article, we talked about how to handle bulk updates of collections. Rather than sending a separate update for each and every entity, we described different approaches for doing it all in a single request, saving precious network round trips and bandwidth. We also learned about JSON Patch, an interesting format to streamline your payloads, which lets you send just the differences rather than the whole content.

Note: This article was originally published on Thierry’s blog.

  • Glen Goffin

    This is an excellent overview and discussion of the options around RESTful bulk updates. Thank you for writing it. It seems like the preferred method (at least for me) would be POST of multiple records with itemized status response when there are errors. PATCH seems heavy-weight and modifying headers seems unnecessary.

    Doesn’t the same problem around itemized status response exist even for single record PUT of a complex object? I could fail an update because person.lastName was an INT or something. If the object has tons of attributes, any one of them could fail to validate.

    Thanks again for a stimulating post.

  • shreyas deshpande

    thanks for the article. we have a additional check to perform . in case of bulk update through put or patch , we need to match etag of all the resources present in bulk. if there is a mismatch someone has edited resource. how to handle such cases ?

  • Raj

    @Shreyas etag mismatch should be handled same as insertion failed. In that case you return failure for that item.

  • Harry Yuan

    I don’t think you can POST to the same endpoint, one time with a single object, one time with a list of objects. Even if you overload your Java method, I don’t think you can map 2 methods to the same endpoint, with the same HTTP verb. How did you solve this problem?

  • Russell Horwood

    PUT is an upsert operation so it’s possible that some entities in a bulk are created and some are updated and the operation is successful. The RFC states that if a PUT operation creates the entity it SHOULD return a 201 Created response, or if updated a 204 No Content or 200 OK response. For bulk PUT would you return 201 if all entities were created, otherwise 200/204? Or would you always return a 200/204?