
I'm trying to work out the best practice in a REST API for telling the client whether a particular resource exists and can be accessed. Two quick example scenarios:

A phone directory lookup service. The client looks up a phone number by accessing e.g.
GET http://host/directoryEntries/numbers/12345
... where 12345 is the phone number to try and find in the directory. If it exists, it would return information like the name and address of the person whose phone number it is.

A video format shifting service. The client submits a video in one format to e.g.
POST http://host/videos/
... and receives a 'video GUID' which has been generated by the server for this video. The client then checks e.g.
GET http://host/videos/[GUID]/flv
... to get the video, converted into the FLV format, if the converted version exists.

You'll notice that in both cases above, I didn't mention what should happen if the resource being checked for doesn't exist. That's my question here. I've read in various other places that the proper RESTful way for the client to check whether the resource exists here is to call HEAD (or maybe GET) on the resource, and if the resource doesn't exist, it should expect a 404 response. This would be fine, except that a 404 response is widely considered an 'error'; the HTTP/1.1 spec states that the 4xx class of status code is intended for cases in which the client 'seems to have erred'. But wait; in these examples, the client has surely not erred. It expects that it may get back a 404 (or others; maybe a 403 if it's not authorized to access this resource), and it has made no mistake whatsoever in requesting the resource. The 404 isn't intended to indicate an 'error condition', it is merely information - 'this does not exist'.

And browsers behave, as the HTTP spec suggests, as if the 404 response is a genuine error. Both Google Chrome and Firebug's console spew out a big red "404 Not Found" error message into the JavaScript console each time an XHR request receives a 404, regardless of whether it was handled by an error handler, and there is no way to disable it. This isn't a problem for the user, as they don't see the console, but as a developer I don't want to see a bunch of 404 (or 403, etc.) errors in my JS console when I know perfectly well that they aren't errors, but information being handled by my JavaScript code. It's line noise. In the second example I gave, it's line noise to the extreme, because the client is likely to be polling the server for that /flv, as it may take a while to compile and the client wants to display 'not compiled yet' until it gets a non-404. There may be a 404 error appearing in the JS console every second or two.

So, is this the best or most proper way we have with REST to check for the existence of a resource? How do we get around the line noise in the JS console? It may well be suggested that, in my second example, a different URI could be queried to check the status of the compilation, like:
GET http://host/videos/[GUID]/compileStatus
... however, this seems to violate the REST principle a little, to me; you're not using HTTP to the full and paying attention to the HTTP status codes and headers, but instead creating your own protocol whereby you return information in the body telling you what you want to know, and always return an HTTP 200 to shut the browser up. This was a major criticism of SOAP - it tries to 'get around' HTTP rather than use it to the full. By this principle, why does one ever need to return a 404 status code? You could always return a 200 - of course, the 200 is indicating that a resource's status information is available, and the status information tells you what you really wanted to know - the resource was not found. Surely the RESTful way should be to return a 404 status code.

This mechanism seems even more contrived if we apply it to the first of my above examples; the client would perhaps query:
GET http://host/directoryEntries/numberStatuses/12345
... and of course receive a 200; the number 12345's status information exists, and tells you... that the number is not found in the directory. This would mean that ANY number queried would be '200 OK', even though it may not exist - does this seem like a good REST interface?

Am I missing something? Is there a better way to determine whether a resource exists RESTfully, or should HTTP perhaps be updated to indicate that non-2xx status codes should not necessarily be considered 'errors', and are just information? Should browsers be able to be configured so that they don't always output non-2xx status responses as 'errors' in the JS console?

PS. If you read this far, thanks. ;-)

Jez

4 Answers


It is perfectly okay to use 404 to indicate that a resource is not found. Some quotes from the book "RESTful Web Services" (a very good book about REST, by the way):

404 indicates that the server can’t map the client’s URI to a resource. [...] A web service may use a 404 response as a signal to the client that the URI is “free”; the client can then create a new resource by sending a PUT request to that URI. Remember that a 404 may be a lie to cover up a 403 or 401. It might be that the resource exists, but the server doesn’t want to let the client know about it.

Use 404 when the service can't find the requested resource; do not overuse it to signal errors that have nothing to do with whether the resource exists. Also, a client may "query" the service to find out whether a URI is free or not.
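To make the "is this URI free?" idea concrete, here is a minimal sketch. It assumes a hypothetical in-memory `ResourceStore` standing in for the service, and the entry body is made up; only the status codes are meant to mirror what the book describes (404 for an unmapped URI, then a PUT that creates the resource).

```python
from http import HTTPStatus


class ResourceStore:
    """Hypothetical in-memory stand-in for a service's resources, keyed by path."""

    def __init__(self):
        self._resources = {}

    def get(self, path):
        # 404 is plain information here: nothing is mapped to this URI yet.
        if path not in self._resources:
            return HTTPStatus.NOT_FOUND, None
        return HTTPStatus.OK, self._resources[path]

    def put(self, path, body):
        # 201 when the PUT creates the resource, 200 when it replaces one.
        status = HTTPStatus.OK if path in self._resources else HTTPStatus.CREATED
        self._resources[path] = body
        return status, body


store = ResourceStore()
path = "/directoryEntries/numbers/12345"

free_status, _ = store.get(path)                           # URI is free
created_status, _ = store.put(path, {"name": "A. Example"})  # client claims it
found_status, entry = store.get(path)                      # now it resolves
```

The client never treats the first 404 as a failure; it is simply the answer to "does anything live here?".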

Performing long-running operations like encoding of video files

HTTP has a synchronous request-response model. The client opens an Internet socket to the server, makes its request, and keeps the socket open until the server has sent the response. [...]

The problem is not all operations can be completed in the time we expect an HTTP request to take. Some operations take hours or days. An HTTP request would surely be timed out after that kind of inactivity. Even if it didn’t, who wants to keep a socket open for days just waiting for a server to respond? Is there no way to expose such operations asynchronously through HTTP?

There is, but it requires that the operation be split into two or more synchronous requests. The first request spawns the operation, and subsequent requests let the client learn about the status of the operation. The secret is the status code 202 (“Accepted”).

So you could do POST /videos to create a video encoding task. The service will accept the task, answer with 202 and provide a link to a resource describing the state of the task.

202 Accepted
Location: http://tasks.example.com/video/task45543

The client may query this URI to see the status of the task. Once the task is complete, a representation of the resource will become available.
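The 202 flow above could be sketched from the client's side roughly like this. Everything except the 202 status and the `Location` URL from the example is an assumption: the `state` and `result` fields, the `FakeEncodingService` class, and the result path are hypothetical, and `fetch` is injected so the sketch doesn't depend on any particular HTTP library.

```python
import time
from http import HTTPStatus

TASK_URL = "http://tasks.example.com/video/task45543"


def wait_for_task(fetch, task_url, interval=1.0, max_polls=60, sleep=time.sleep):
    """Poll task_url until the task resource reports that it is done.

    `fetch` is any callable returning (status, body_dict).
    """
    for _ in range(max_polls):
        status, body = fetch(task_url)
        if status != HTTPStatus.OK:
            raise RuntimeError(f"unexpected status {status} for {task_url}")
        if body.get("state") == "done":  # hypothetical field name
            return body.get("result")
        sleep(interval)
    raise TimeoutError(f"{task_url} did not finish after {max_polls} polls")


class FakeEncodingService:
    """Hypothetical stand-in for the server; finishes after a fixed number of polls."""

    def __init__(self, polls_until_done=3):
        self._remaining = polls_until_done
        self.polls = 0

    def submit(self):
        # POST /videos: the task is accepted, not yet completed.
        return HTTPStatus.ACCEPTED, {"Location": TASK_URL}

    def fetch(self, url):
        # GET on the task resource: always 200, the state lives in the body.
        self.polls += 1
        self._remaining -= 1
        if self._remaining > 0:
            return HTTPStatus.OK, {"state": "pending"}
        return HTTPStatus.OK, {"state": "done", "result": "/videos/example-guid/flv"}


service = FakeEncodingService(polls_until_done=3)
status, headers = service.submit()
result = wait_for_task(service.fetch, headers["Location"], sleep=lambda seconds: None)
```

Because the task resource exists from the moment the 202 is sent, every poll is a 200 and nothing shows up as an error while the encoding is still running.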

galymzhan
  • "It is perfectly okay to use 404 to indicate that resource is not found" - OK, but what would your suggestion be about the fact that it causes the browser to pollute the JS console with a load of 404 errors? And, is the HTTP/1.1 spec wrong to say that a 4xx code indicates that the client has erred in some way? – Jez Apr 21 '12 at 15:12
  • That's not a fault of the client-side JS code, it's a browser-specific thing. [You can disable 404 errors](http://stackoverflow.com/a/7429313/450449). Alternatively, you can use the approach with a task queue, which avoids 404s. – galymzhan Apr 21 '12 at 15:47

I think you have changed the semantics of the request. With a RESTful architecture, you are requesting a resource. Therefore requesting a resource that does not exist, or cannot be found, is considered an error.

I use:

  • 404 if the resource at GET http://host/directoryEntries/numbers/12345 does not exist.

  • 400 if the request is actually a bad request (400 Bad Request).

Perhaps, in your case, you could think about searching instead. Searches are done with query parameters on a collection of resources.

What you want is GET http://host/directoryEntries/numbers?id=1234, which would return 200 with a list of matches, or an empty list if none exist.
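As a rough sketch of the difference, assuming a hypothetical in-memory `DIRECTORY` and made-up entry data: addressing the number directly yields 404 when it is missing, while searching the collection always yields 200, because the collection itself exists.

```python
from http import HTTPStatus

# Hypothetical directory contents, keyed by phone number.
DIRECTORY = {"12345": {"name": "A. Example", "address": "1 Example Street"}}


def get_number(number):
    """GET /directoryEntries/numbers/{number}: addresses one resource."""
    entry = DIRECTORY.get(number)
    if entry is None:
        return HTTPStatus.NOT_FOUND, None
    return HTTPStatus.OK, entry


def search_numbers(number):
    """GET /directoryEntries/numbers?id={number}: queries the collection."""
    matches = [entry for key, entry in DIRECTORY.items() if key == number]
    return HTTPStatus.OK, matches


missing_lookup = get_number("99999")      # 404, no body
missing_search = search_numbers("99999")  # 200 with an empty list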

jermel
  • The search option is a nice solution to stop the browser giving the 404 error in the console, I agree... but it seems like just another (perhaps less RESTful) way of just requesting `/numbers/12345`. – Jez Apr 21 '12 at 14:32
  • They mean similar, but different, things. The first case is using a URI (i.e. the actual number resource, which does not exist). The second is querying a URI (numbers) which does exist, and returning the matches, which may be empty. – jermel Apr 21 '12 at 14:40

IMO the client has indeed erred in requesting a non-existent resource. In both your examples the service can be designed in a different way so an error can be avoided on the client side. For example, in the video conversion service as the GUID has already been assigned, the message body at videos/id can contain a flag indicating whether the conversion was done or not.
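A minimal sketch of that flag idea, assuming a hypothetical in-memory `VIDEOS` table and a made-up `flv_ready` field name: once the GUID has been assigned, `GET /videos/{id}` always answers 200, and the body says whether the conversion is finished.

```python
from http import HTTPStatus

# Hypothetical server state: video id -> set of formats already converted.
VIDEOS = {"example-guid": set()}


def get_video(video_id):
    """GET /videos/{id}: 404 only when the id itself is unknown."""
    if video_id not in VIDEOS:
        return HTTPStatus.NOT_FOUND, None
    formats = VIDEOS[video_id]
    return HTTPStatus.OK, {"id": video_id, "flv_ready": "flv" in formats}


pending = get_video("example-guid")  # 200, flag is False
VIDEOS["example-guid"].add("flv")    # the conversion finishes
ready = get_video("example-guid")    # 200, flag is True
```

The client polls a resource that is known to exist and only requests the `/flv` representation after the flag flips, so a 404 is left for genuinely unknown ids.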

Similarly, in the phone directory example, you are searching for a resource and this can be handled through something like /numbers/?search_number=12345 etc. so that the server returns a list of matching resources which you can then query further.

Browsers are designed to work with the HTTP spec, and showing an error is a genuine response (pretty helpful too). However, you need to think of your JavaScript code as a separate entity from the browser. So you have your JavaScript REST client, which knows what the service is like, and the browser, which is sort of dumb with regard to your service.

Also, REST is independent of protocols in theory. HTTP happens to be the most common protocol where REST is used. Another example I can think of is Android content providers whose design is RESTful but not dependent on HTTP.

Abhinav
  • As I said to jermel, `?search_number=12345` is a nice solution to stop the browser putting errors into the console, but as `12345` is to be a unique ID, it just seems like another (perhaps less RESTful) way of saying `/12345`. – Jez Apr 21 '12 at 14:38
  • Actually no. You are sending a modifier param to the resource list at /numbers/. It's the same way you would search for geographic locations like /locations/?lat=12.0&lng=76.0&radius=100.0. – Abhinav Apr 21 '12 at 14:45

I've only ever seen GET/HEAD requests return 404 (Not Found) when a resource doesn't exist. I think if you are just trying to get the status of a resource, a HEAD request would be fine, as it shouldn't return the body of the resource. This way you can differentiate between requests where you are trying to retrieve the resource and requests where you are trying to check for its existence.

http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
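For illustration, an existence check via HEAD might look like the sketch below, using Python's standard `http.client`. The function names and the three-way True/False/None result are my own assumptions; the point is only that a 404 maps to "does not exist" rather than to an exception.

```python
import http.client
from http import HTTPStatus


def interpret_head_status(status):
    """Map a HEAD status to True (exists), False (missing) or None (undecided)."""
    if 200 <= status < 300:
        return True
    if status == HTTPStatus.NOT_FOUND:
        return False
    return None  # e.g. 401/403/5xx: existence can't be decided from the status


def resource_exists(host, path, port=80, timeout=5.0):
    """Send HEAD http://host:port/path and interpret the response status."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        response.read()  # a HEAD response has no body; this completes the exchange
        return interpret_head_status(response.status)
    finally:
        conn.close()
```

A call such as `resource_exists("host", "/videos/[GUID]/flv")` would then return `False` for a 404 instead of surfacing it as a failure.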

Edit: I remember reading about an alternative solution by adding a header to the original request that indicated how the server should handle 404 errors. Something along the lines of responding with 200, but an empty body.

Joshua Dale
  • But the real trouble I have (both with GET and HEAD) is that a 404 is considered an error, and the browser acts that way, cluttering up my JS console with useless 404 error messages. I only want legitimate errors in the JS console. – Jez Apr 21 '12 at 14:12
  • Sure, I guess the difference is how you want the client to respond to nonexistent resources vs. what is logically correct REST on the server. I remember reading something about this in a book, but I have too many REST books! :) If I can find it, I'll edit my question. Off the top of my head I think it might have had something to do with adding a header to the request on how the client wanted the server to handle 404 errors. Something along the lines of responding with 200, but including an empty body to indicate the resource wasn't found. – Joshua Dale Apr 21 '12 at 16:44