A long-running task is an operation that requires a considerable amount of server resources and/or time. To avoid blocking the client, the task must be completed asynchronously, without holding a persistent connection open between the client and the server.
After submitting the task, the client polls a provided URL for the task's execution progress. If WebSockets are supported, the server can instead notify the client over a WebSocket when execution finishes, so the client does not need to poll.
Although Roy Fielding does not mention long-running tasks in his dissertation, we can combine the REST principles with other HTTP RFCs to design a viable solution.
1. HTTP Status Code
The most suitable status code to return from a long-running REST API is HTTP 202 Accepted. Once the request has been accepted, there is no facility for resending a status code from the submitted asynchronous operation.
According to RFC 9110 (which obsoletes RFC 7231), the 202 (Accepted) status code indicates that the request has been accepted for processing, but the processing has not been completed.
The representation sent with a 202 response should describe the request's current status (generally, that it has been submitted) and include a Location header pointing to a status monitor that can provide the client with an estimate of when the request will be fulfilled.
2. API Response
The response from a long-running API should contain only the necessary information about the current status of the submitted task. The following is an example of such an API that accepts scripts to run on the devices.
HTTP POST: /device-management/script-execution/new
{
"device-ids": [1, 2, 3],
"script-url": "/temp/test-script.sh"
}
The response to the above request can be as follows, where 123456789 is a random number identifying the long-running task in progress. Multiple such tasks can be executing on the server at any time.
HTTP Status 202
Location: /device-management/script-execution/123456789
{
"device-ids": [1, 2, 3],
"script-url": "/temp/test-script.sh",
"status": "SUBMITTED"
}
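The submission step above can be sketched in a few lines. This is only a minimal illustration, assuming an in-memory dictionary as the task registry and a sequential counter for ids; the names `TASKS`, `BASE_PATH`, and `submit_task` are hypothetical and not part of any framework. The key point is that the handler records the task and returns 202 with a Location header immediately, without running the script.

```python
import itertools
import json

# Hypothetical in-memory registry; a real service would use a database or a job queue.
TASKS = {}
_next_id = itertools.count(1)

# Path taken from the example request above.
BASE_PATH = "/device-management/script-execution"


def submit_task(payload):
    """Register the task and return (status_code, headers, body) without executing it."""
    task_id = next(_next_id)
    task = {
        "device-ids": payload["device-ids"],
        "script-url": payload["script-url"],
        "status": "SUBMITTED",
    }
    TASKS[task_id] = task
    headers = {
        "Location": f"{BASE_PATH}/{task_id}",  # status monitor URL for this task
        "Content-Type": "application/json",
    }
    return 202, headers, json.dumps(task)
```

A background worker (not shown) would later pick the task up from the registry and update its status.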
3. Completion Status of Long-Running Task
After the task has been submitted, the client can poll the URL provided in the Location header to get the current status of the long-running task. A sample response body can be:
HTTP GET: /device-management/script-execution/123456789
{
"device-ids": [1, 2, 3],
"script-url": "/temp/test-script.sh",
"status": "INPROGRESS",
"percentage": "45%"
}
The task completion status and percentage change as execution progresses. We can use other status constants and progress indicators as well, depending on project requirements.
Once the task is finished, the API can either include the task execution result in the same status response, or provide another URL that returns the task execution result.
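On the client side, polling can be sketched as a simple loop. This is a minimal sketch under stated assumptions: `fetch_status` is a hypothetical callable that performs the GET on the Location URL and returns the decoded JSON body, and COMPLETE is treated as the only terminal status because it is the only one shown in the examples. A real API would likely add failure or cancelled states to `terminal_statuses`.

```python
import time


def poll_until_done(fetch_status, interval_seconds=2.0, max_attempts=30,
                    terminal_statuses=("COMPLETE",), sleep=time.sleep):
    """Call fetch_status() repeatedly until the task reaches a terminal status.

    fetch_status: hypothetical callable returning the status body as a dict.
    sleep: injectable so the loop can be exercised without real delays.
    """
    for _ in range(max_attempts):
        body = fetch_status()
        if body["status"] in terminal_statuses:
            return body
        # Not finished yet: wait before asking again.
        sleep(interval_seconds)
    raise TimeoutError("task did not finish within the polling budget")
```

Bounding the number of attempts keeps a client from polling forever if the task never reaches a terminal status.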
HTTP GET: /device-management/script-execution/123456789
{
"device-ids": [1, 2, 3],
"script-url": "/temp/test-script.sh",
"status": "COMPLETE",
"percentage": "100%",
"result": {
"id": 123456789,
"sys-log-location":"/log/….",
"err-log-location":"/log/….",
"success-on-devices": [1, 2],
"failed-on-devices": [3]
}
}
Alternatively, the execution status response can point to a new location for accessing the result.
HTTP GET: /device-management/script-execution/123456789
{
"device-ids": [1, 2, 3],
"script-url": "/temp/test-script.sh",
"status": "COMPLETE",
"percentage": "100%",
"result": "/device-management/script-execution/123456789/result"
}
HTTP GET: /device-management/script-execution/123456789/result
{
"id": 123456789,
"sys-log-location":"/log/….",
"err-log-location":"/log/….",
"success-on-devices": [1, 2],
"failed-on-devices": [3]
}
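A client can handle both result styles with one small helper. The sketch below assumes the "result" field is either an embedded object or a URL string, as in the two examples above; `resolve_result` and `fetch_json` are hypothetical names, with `fetch_json` standing in for whatever HTTP client call returns decoded JSON for a URL.

```python
def resolve_result(status_body, fetch_json):
    """Return the execution result from a COMPLETE status body.

    fetch_json: hypothetical callable mapping a URL string to a decoded JSON dict.
    """
    if status_body.get("status") != "COMPLETE":
        raise ValueError("task has not completed yet")
    result = status_body["result"]
    if isinstance(result, str):
        # Linked style: "result" is a URL pointing at a separate result resource.
        return fetch_json(result)
    # Embedded style: "result" already holds the result object.
    return result
```

Embedding saves a round trip, while linking keeps the status response small when results are large.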
Also, consider using real-time messaging systems (such as Apache Kafka) for publishing the task status, so that subscribed clients are notified of changes. Whether this fits generally depends on the type of client:
- An API client can dynamically subscribe to a topic URL provided in the Location header, so a message queue works well for communication between two API clients.
- For communication between a browser and the server-hosted API, a simple REST-style API response will be more suitable.
4. Canceling an In-Progress Task
A task can be submitted by mistake, so there must be a way to cancel it to prevent further damage to the system. The request can be cancelled partially or fully. The changes a task has made before it is cancelled can either be kept or rolled back. All these decisions depend on the application requirements and capabilities.
The cancel operation should be idempotent.
A client can send an HTTP DELETE request to the URL provided in the Location header when the task was submitted. The URL contains the task id (execution id), which the server uses to identify the task to cancel.
HTTP DELETE: /device-management/script-execution/123456789
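An idempotent cancel handler can be sketched as follows. Everything here is an assumption for illustration: the in-memory `TASKS` registry, the CANCELLED status name, and the choice of 204, 404, and 409 as response codes are not prescribed by the text above. The point is that repeating the DELETE leaves the task in the same state and returns the same answer.

```python
# Hypothetical in-memory registry holding one submitted task.
TASKS = {123456789: {"status": "SUBMITTED"}}


def cancel_task(task_id):
    """Handle DELETE on the task URL and return an HTTP status code."""
    task = TASKS.get(task_id)
    if task is None:
        return 404  # unknown task id
    if task["status"] == "COMPLETE":
        return 409  # assumption: a finished task can no longer be cancelled
    # Setting the same status again is harmless, which makes repeats safe.
    task["status"] = "CANCELLED"
    return 204
```

Whether the worker stops immediately, and whether partial changes are rolled back, remains an application-level decision.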
5. Best Practices
- Do not wait for long-running tasks to complete as part of ordinary HTTP request processing.
- Provide dedicated URLs to query the task status.
- Provide a mechanism to cancel a long-running task.
- The task execution process should not depend on the client in any way.
- Consider using the Retry-After header field in the API response to indicate how long the user agent ought to wait before retrying the same request if the previous request was not accepted for any reason.
- Consider using the RFC 7807 (Problem Details for HTTP APIs; since obsoleted by RFC 9457) specification when returning an error response.
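The last two practices can be combined in one rejection response. The sketch below is only an illustration: the function name `rejected_response` and the choice of 503 are assumptions, while the `type`, `title`, `status`, and `detail` members and the `application/problem+json` media type come from the Problem Details specification.

```python
import json


def rejected_response(retry_after_seconds, detail):
    """Build (status_code, headers, body) for a submission the server cannot accept now."""
    problem = {
        "type": "about:blank",           # default type when no specific problem URI is defined
        "title": "Service Unavailable",  # matches the HTTP status phrase
        "status": 503,
        "detail": detail,
    }
    headers = {
        "Content-Type": "application/problem+json",
        "Retry-After": str(retry_after_seconds),  # delay in seconds before retrying
    }
    return 503, headers, json.dumps(problem)
```

For example, `rejected_response(30, "Task queue is full")` tells the client to wait 30 seconds before resubmitting.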