Ever since long-running Collection API calls were introduced, users have often noticed that these calls time out every now and then. Calls like ShardSplit time out without giving much information about the state of the request. Though most calls are practically idempotent, it would still be much better for users to know whether a request is still in progress, has failed, or actually completed after the timeout duration.
This led me to start working on asynchronous calls for the OverseerCollectionProcessor (SOLR-5477). The code is already in both the trunk and branch_4x branches of Apache Lucene/Solr and should be released with 4.8.
Supporting asynchronous Collection API calls is not as trivial as it sounds. The intention while designing this feature was to never lose a call somewhere in between. The general flow of a Collection API call involves multiple CoreAdmin calls too. Leaving those synchronous would mean the internal CoreAdmin calls could still time out, pretty much defeating the purpose.
The overall design of async Collection API calls involves the following API components:
- Async call API: Extra parameter ‘async=xx’ for Collection API calls that makes the calls async,
- Request Status Collection API to check the status of a previously submitted async call,
- CoreAdmin async call support,
- CoreAdmin request status API.
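
The first two components can be pictured as a pair of URLs: one that submits a call with a caller-chosen request id, and one that asks for the status of that id later. The snippet below is only a rough sketch of that idea, not the actual implementation. The host, collection name, shard name and request id are placeholders, the helper functions are hypothetical, and the REQUESTSTATUS action with its requestid parameter is assumed here, so check it against the reference guide for your Solr version.

```python
from urllib.parse import urlencode

# Assumed base URL of a local Solr node; adjust for a real cluster.
SOLR_BASE = "http://localhost:8983/solr"


def collection_api_url(action, params):
    """Build a Collection API URL for the given action and parameters."""
    query = urlencode({"action": action, **params})
    return f"{SOLR_BASE}/admin/collections?{query}"


def async_call_url(action, request_id, params):
    """Same call as usual, plus the extra 'async' parameter carrying the request id."""
    return collection_api_url(action, {**params, "async": request_id})


def request_status_url(request_id):
    """URL used to check on a previously submitted async call.

    Assumption: the status action is REQUESTSTATUS and takes 'requestid'.
    """
    return collection_api_url("REQUESTSTATUS", {"requestid": request_id})


if __name__ == "__main__":
    # Placeholder collection, shard and request id, for illustration only.
    print(async_call_url("SPLITSHARD", "1000",
                         {"collection": "collection1", "shard": "shard1"}))
    print(request_status_url("1000"))
```

A client would issue an HTTP GET against the first URL, get control back right away, and then poll the second URL until the request is reported as finished or failed.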
The Collection API ‘async’ call:
Here’s what an async SPLITSHARD call looks like: