Difference between revisions of "Talk:Certified HTTP"

Latest revision as of 12:14, 4 November 2007

X-Message-ID

How about replacing the $random_uuid with an MD5 (or stronger) digest of the message body? That would eliminate the undefined result of sending a message with the same message id but a different body. (Undefined results can be opportunities for exploits.) --Omei Turnbull 20:21, 10 July 2007 (PDT)

That wouldn't solve the problem: if you sent two identical message bodies, you would still be sending two messages with the same message id. You would be guaranteeing a collision. It is really only an issue if two messages with the same ID are being processed at the same time. I think using $random_uuid is reasonable; in the event of a Message-ID collision or a malformed Message-ID, have the server return a 400. -- Strife Onizuka 00:01, 11 July 2007 (PDT)

Heh, looks like Strife and I saw this at the same moment. :-) I guess if you digested the entire state of the message, including headers, receiving url, and sending url/host, then you're only ruling out the case where you do actually want to send two of the exact same message between the same two hosts at the exact same time and have both be processed as independent messages. Thanks for the idea! Which Linden 00:06, 11 July 2007 (PDT)
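
A minimal sketch of the "digest the entire state of the message" idea, in case it helps the discussion. The function name message_digest_id, its parameters, and the choice of SHA-256 are assumptions made for illustration; nothing here comes from the chttp spec.

```python
import hashlib


def message_digest_id(method, url, sender, headers, body):
    """Derive a message id from the whole message state (hypothetical helper).

    method, url, sender: strings; headers: dict of str -> str; body: bytes.
    """
    digest = hashlib.sha256()
    # Header names are case-insensitive in HTTP, so normalise and sort them
    # to keep the digest independent of header order and casing.
    normalised = sorted((name.lower(), value) for name, value in headers.items())
    parts = [method.upper(), url, sender]
    parts += [f"{name}:{value}" for name, value in normalised]
    for part in parts:
        data = part.encode("utf-8")
        # Length-prefix every field so adjacent fields cannot run together.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    digest.update(len(body).to_bytes(8, "big"))
    digest.update(body)
    return digest.hexdigest()
```

As noted above, two deliberately identical messages between the same two hosts would still get the same id under this scheme, so it trades that one case for a defined result everywhere else.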

X-Message-URL

"If the client performs a GET to the message url prior to DELETE, the server must return the same body as the original response, including the X-Message-URL."

I think this may be an attack point, since it would require the server to cache the response body. There should also be a short timeout on the caching. This requirement makes it look like this protocol isn't something you want to use with unauthorized clients. -- Strife Onizuka 00:18, 11 July 2007 (PDT)

How does this differ from the explicit requirement that "the server must persist the response body or have a mechanism to idempotent generate the same response to the same request"? But yes, it seems like the need to persist un-acknowledged response bodies for 15 days creates a big opportunity for a DDOS-type attack if reliable messages from unauthenticated clients are allowed. - Omei Turnbull

That would be a problem if it only statically persisted a complete response. If it has a way to generate the idempotent response at any later time, there is no need to statically persist the complete response. Dzonatas Sol 10:22, 11 July 2007 (PDT)

The entire response body has to be stored so that if the connection was terminated before the response was received it can be requested by sending a GET to X-Message-URL (instead of a DELETE). -- Strife Onizuka 15:27, 11 July 2007 (PDT)

That would be one way. There are other ways to generate a response that is idempotent. Dzonatas Sol 15:49, 11 July 2007 (PDT)

True, but it really depends on what is being requested and whether what is wanted is the old status or the current status. -- Strife Onizuka 16:11, 11 July 2007 (PDT)

Indeed, it depends on the intervals of when data is aggregated or archived. The complexity begins there. Dzonatas Sol 16:15, 11 July 2007 (PDT)

The issue I see (which I think is the same issue Strife was referring to) is that when the server gets a request, it has to check to see whether it is a duplicate of any outstanding request (i.e. a request for which the response hasn't been acknowledged) issued within the last 15 days, so that if it is, it will repeat its previous response. If these requests can come from untrusted clients, the server could purposely be subjected to requests that are not acknowledged, until the server is no longer able to respond to any requests in a timely manner.--Omei Turnbull 12:59, 11 July 2007 (PDT)

That's what I was thinking. -- Strife Onizuka 15:27, 11 July 2007 (PDT)

Though there are certain things you could do to stave off DoS attacks like refusing to serve more than N connections at a time and refusing to even open a tcp connection to any future ones, I agree that without further consideration it's probably not a good idea to do chttp between hosts that don't trust each other. Which Linden 21:12, 3 October 2007 (PDT)

This could be mitigated by adding a header field in the client message saying whether this is an initial or a follow-up request. The server would not be required to check initial requests against its history of open requests. Follow-up requests, which would require a lookup against the open requests, could be done at reduced priority, so that they don't interfere with normal traffic.--Omei Turnbull 16:43, 11 July 2007 (PDT)

Wow, how did I not see this earlier? I think that the problem with this idea is that the client doesn't, in general, know whether it's making an initial or a follow up request. If it hasn't received a response from the server, it doesn't know whether the server received the message and didn't respond, or if the message never made it to the server in the first place (or was incomplete, as would happen if the network died mid-send). Which Linden 21:12, 3 October 2007 (PDT)

Thoughts

For a simple setup, X-Message-URL could be the original request URL. The DELETE command should send the same X-Message-ID as the original request to better facilitate this. This would have the added benefit that a client could possibly cancel a request by sending a DELETE before getting a response. This doesn't have to be all that complicated to implement. Once a request comes in, check whether there is a DB entry for it. If there is none, execute the command, write the body to a file, and return the body. If an entry already exists, check the status of the request; if there is a body, return it, and wait for the timeout or a DELETE command before removing the DB entry. -- Strife Onizuka 15:34, 11 July 2007 (PDT)
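
The flow described above could be sketched roughly as follows. This is only an illustration under stated assumptions: an in-memory dict stands in for the DB and the file, the class and method names (MessageStore, handle_request, handle_delete) are hypothetical, and the 15-day default is taken from the retention period mentioned elsewhere on this page.

```python
import time

FIFTEEN_DAYS = 15 * 24 * 60 * 60  # retention window in seconds (assumed default)


class MessageStore:
    """In-memory stand-in for the DB of outstanding requests (hypothetical)."""

    def __init__(self, max_age=FIFTEEN_DAYS):
        self.max_age = max_age
        self._entries = {}  # message_id -> (response_body, stored_at)

    def handle_request(self, message_id, execute, now=None):
        """Run execute() once per message id; replay the stored body afterwards."""
        now = time.time() if now is None else now
        self._expire(now)
        entry = self._entries.get(message_id)
        if entry is not None:
            return entry[0]  # duplicate request: return the saved body
        body = execute()  # first time we see this id: do the work
        self._entries[message_id] = (body, now)
        return body

    def handle_delete(self, message_id):
        """Client acknowledged the response; drop the entry. True if it existed."""
        return self._entries.pop(message_id, None) is not None

    def _expire(self, now):
        # Remove entries older than the timeout so the store cannot grow forever.
        stale = [mid for mid, (_, stored_at) in self._entries.items()
                 if now - stored_at > self.max_age]
        for mid in stale:
            del self._entries[mid]
```

A duplicate request with the same id returns the saved body without calling execute again; after a DELETE or the timeout, the id is treated as new. The sketch does not cover cancelling a request that is still executing.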

Cleanup

Limiting the number of concurrent requests per client is one way of reducing DOS risk, but if a client crashes and reconnects then it will need some way of finding out what requests need to be closed. There should probably be some interface through which the client can query the server to find out what requests it has open. The response to such a query would give URLs to close those requests, but it should not be able to return those requests' response bodies, as that could be a security breach. Additionally, the requests returned should be tagged with the session that created them and the status of that session (so multiple sessions can use the same authorization information concurrently). -- Strife Onizuka 16:24, 11 July 2007 (PDT)

This is a really interesting point. We've given very little thought to communication with untrusted and unreliable clients. Right now, we are assuming that the client is a reliable service itself -- chttp is essentially a peer protocol between durable hosts. In particular, we want to assume delivery, which means that all the application logic doesn't have to contain a lot of failure cases (always a bugbear to write and test). The application just assumes that the message will eventually make it, despite temporary failures on both sending and receiving hosts. With those assumptions, it's reasonable to say that the client doesn't need to query the server to find out what requests it has open. It's not clear if the semantics even mean anything if the client is untrusted or unreliable. Which Linden 22:01, 11 July 2007 (PDT)

Perhaps there needs to be a way of identifying trusted hosts? Perhaps using HTTP authentication on servers that require identity? And if they can be identified, don't apply a limit to concurrent requests, or at least increase the limit? Though this would surely make https the preferred protocol in such a case, so that authentication information is more secure. -- Nik Woodget 10:20, 19 September 2007

Could use client certificates, too. I think specifying an authentication method is outside the scope of certified http, though. Which Linden 21:59, 3 October 2007 (PDT)

As much as possible, things like authentication should follow existing http style practice. I'd like to see it compose with the related specs. This seems like an overall good principle. Zha Ewry 12:40 3 October 2007

Status Codes

I suggest we adjust the interpretation of a few of the status codes.

304 Not Modified: I believe we should add a requirement that no cHTTP enabled server will return 304 to any request which includes x-message-id. In order to maintain idempotence, we should typically return 200, but at least we should return 200 <= status < 300.
203 Non-Authoritative Information: I think we should consider this a success because the use case where this makes sense is if the server has a secondary persistent store which it considers good enough for this particular transaction. It also makes me nervous to think of anything in 2xx other than 202 as a retry.
406 Not Acceptable: Since the persisted client request theoretically includes the Accept header, there is no way that further calls will ever return anything other than 406. So shouldn't we fail sooner?
411 Length Required: A cHTTP dialogue requires Content-Length for every non-zero length message, so this sounds like a permanent failure to me.

Phoenix Linden 22:28, 1 November 2007 (PDT)

Hah, we should have looked at this page, since it already kinda does the work for us.

304: I don't think it really matters. If the client wants to add an If-Modified-Since header, why shouldn't the server accommodate it? I don't think there's a compelling reason to make chttp behave differently than standard http with regards to this status code.
203: Agreed.
406: The case where this might be usefully ambiguous is when a server is temporarily incapacitated or misconfigured and throws 406s until it gets fixed. Not sure if that's plausible though.
411: Agreed.

Thanks for looking these over! I'm gonna wait for Sardonyx to weigh in before moving stuff around though.

Which Linden 23:21, 1 November 2007 (PDT)

304: Agree.
406: Maybe. Let's leave it ambiguous.

Phoenix Linden 11:14, 4 November 2007 (PST)
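
One way to read where this thread ended up, sketched as a client-side classification. The Outcome names and the classify_status function are hypothetical, and treating 202 as a retry is only inferred from the "anything in 2xx other than 202" remark above. 304 gets no chttp-specific handling here, in line with the view that it should behave as in standard http, and any code not discussed in this thread is reported as not covered rather than guessed.

```python
from enum import Enum


class Outcome(Enum):
    SUCCESS = "success"
    RETRY = "retry"
    AMBIGUOUS = "ambiguous"
    PERMANENT_FAILURE = "permanent failure"
    NOT_COVERED = "not covered in this thread"


def classify_status(status):
    """Classify an HTTP status code per the positions taken in this thread."""
    if status == 202:
        return Outcome.RETRY  # assumption: inferred from the 203 comment
    if 200 <= status < 300:
        return Outcome.SUCCESS  # includes 203 Non-Authoritative Information
    if status == 406:
        return Outcome.AMBIGUOUS  # left ambiguous by agreement
    if status == 411:
        return Outcome.PERMANENT_FAILURE  # Content-Length is required
    return Outcome.NOT_COVERED  # e.g. 304: no chttp-specific rule proposed
```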

@@ Line 13: / Line 13: @@
 I think this may be an attack point, since it would require the server to cache the response body. There should also be a short timeout on the caching. This requirement makes it look like this protocol isn't something you want use with unauthorized clients. -- [[User:Strife Onizuka|Strife Onizuka]] 00:18, 11 July 2007 (PDT)
-::How does this differ from the explicit requirement that "the server must persist the response body or have a mechanism to indefinitely generate the same response to the same request"?  But yes, it seems like the need to persist un-acknowledged response bodies for 15 days creates a big opportunity for a DDOS-type attack if reliable messages from unauthenticated clients are allowed. - [[User:Omei_Turnbull|Omei Turnbull]]
+::How does this differ from the explicit requirement that "the server must persist the response body or have a mechanism to idempotent generate the same response to the same request"?  But yes, it seems like the need to persist un-acknowledged response bodies for 15 days creates a big opportunity for a DDOS-type attack if reliable messages from unauthenticated clients are allowed. - [[User:Omei_Turnbull|Omei Turnbull]]
 ::: That would be a problem if it only statically persisted a complete response. If it has a way to generate the idempotent response at any later time, there is no need to statically persist the complete response. [[User:Dzonatas Sol|Dzonatas Sol]] 10:22, 11 July 2007 (PDT)
 :::: The entire response body has to be stored so that if the connection was terminated before the response was received it can be requested by sending a GET to X-Message-URL (instead of a DELETE). -- [[User:Strife Onizuka|Strife Onizuka]] 15:27, 11 July 2007 (PDT)
+::::: That would be one way. There are other ways to generate a response that is idempotent. [[User:Dzonatas Sol|Dzonatas Sol]] 15:49, 11 July 2007 (PDT)
+:::::: True, but it really depends on what is being requested and whether what is wanted is the old status or the current status. -- [[User:Strife Onizuka|Strife Onizuka]] 16:11, 11 July 2007 (PDT)
+::::::: Indeed, it depends on the intervals of when data is aggregated or archived. The complexity begins there. [[User:Dzonatas Sol|Dzonatas Sol]] 16:15, 11 July 2007 (PDT)
 ::The issue I see (which I think is the same issue Strife was referring to) is that when the server gets a request, it has to check to see whether it is a duplicate of any outstanding request (i.e. a request for which the response hasn't been acknowledged) issued within the last 15 days, so that if it is, it will repeat its previous response.  If these requests can come from untrusted clients, the server could purposely be subjected to requests that are not acknowledged, until the server is no longer able to respond to any requests in a timely manner.--[[User:Omei Turnbull|Omei Turnbull]] 12:59, 11 July 2007 (PDT)
 :::Thats what I was thinking. -- [[User:Strife Onizuka|Strife Onizuka]] 15:27, 11 July 2007 (PDT)
+:::: Though there are certain things you could do to stave off DoS attacks like refusing to serve more than N connections at a time and refusing to even open a tcp connection to any future ones, I agree that without further consideration it's probably not a good idea to do chttp between hosts that don't trust each other.  [[User:Which Linden|Which Linden]] 21:12, 3 October 2007 (PDT)
+::This could be mitigated by adding a header field in the client message saying whether this is an initial or a follow-up request.  The server would not be required to check initial requests against its history of open requests.  Follow-up requests, which would require a lookup against the open requests, could be done at reduced priority, so that they don't interfere with normal traffic.--[[User:Omei Turnbull|Omei Turnbull]] 16:43, 11 July 2007 (PDT)
+::: Wow, how did I not see this earlier?  I think that the problem with this idea is that the client doesn't, in general, know whether it's making an initial or a follow up request.  If it hasn't received a response from the server, it doesn't know whether the server received the message and didn't respond, or if the message never made it to the server in the first place (or was incomplete, as would happen if the network died mid-send).  [[User:Which Linden|Which Linden]] 21:12, 3 October 2007 (PDT)
 == Thoughts ==
 For a simple setup X-Message-URL could be the original request URL. The DELETE command should send the same X-Message-ID as the orig request to better facilitate this. This would have the added benefit that a client could possibly cancel a request by sending a DELETE before getting a response.
 This doesn't have to be all that complicated to implement. Once a request comes in, check to see if there is a DB entry for it, if it does not exist in the DB, execute the command and write the body to file and return the body. If it does exist in the DB already, check the status of the request, if there is a body, return it and wait for the timeout or DELETE command before removing the DB entry. -- [[User:Strife Onizuka|Strife Onizuka]] 15:34, 11 July 2007 (PDT)
+== Cleanup ==
+Limiting the number of concurrent requests per client is one way of reducing DOS risk but if a client crashes and reconnects then it will need some way of finding out what requests need to be closed. There should probably be some interface the client can query the server to find out what requests it has open. The response to such a query would give URLs to close those requests but they should not be able to return those request's response bodies as that could be a security breach. Additionally those requests returned should be tagged with the session that created them and the status of that session (so multiple sessions can use the same authorization information concurrently). -- [[User:Strife Onizuka|Strife Onizuka]] 16:24, 11 July 2007 (PDT)
+:This is a really interesting point.  We've given very little thought to communication with untrusted and unreliable clients.  Right now, we are assuming that the client is a reliable service itself -- chttp is essentially a peer protocol between durable hosts.  In particular, we want to assume delivery, which means that all the application logic doesn't have to contain a lot of failure cases (always a bugbear to write and test). The application just assumes that the message will eventually make it, despite temporary failures on both sending and receiving hosts.  With those assumptions, it's reasonable to say that the client doesn't need to query the server to find out what requests it has open.  It's not clear if the semantics even mean anything if the client is untrusted or unreliable.  [[User:Which Linden|Which Linden]] 22:01, 11 July 2007 (PDT)
+::Perhaps there needs to be a way of identifing trusted hosts? Perhaps using HTTP authentication on servers that require identity? and if they can be identified, don't apply a limit to concurrent requests or at least increase the limit? Though this would surely make https as the perfered protocol in such a case, so authentication information is more secure. -- [[User:Nik Woodget|Nik Woodget]] 10:20, 19 September 2007
+::: Could use client certificates, too.  I think specifying an authentication method is outside the scope of certified http, though.  [[User:Which Linden|Which Linden]] 21:59, 3 October 2007 (PDT)
+::: As much as possible, things like authentication, should follow existing http style practice. I'd like to see it compose with the related specs. This seems like an overall good principle [[User: Zha Ewry|Zha Ewry]] 12:40 3 October 2007
+== Status Codes ==
+I suggest should adjust interpretation of a few of the status codes.
+; 304 Not Modified : I believe we should add a requirement that no cHTTP enabled server will return 304 to any request which include x-message-id. In order to maintain idempotence, we should typically return 200, but at least we should return 200 <= status < 300.
+; 203 Non-Authoritative Information : I think we should consider this a success because the use case where this makes sense is if the server has a secondary persistent store which it considers good enough for this particular transaction. It also makes me nervous to think of anything in 2xx other than 202 as a retry.
+; 406 Not Acceptable : Since the persisted client request theoretically includes the Accept header, there is no way that further calls will ever return anything other than 406. So shouldn't we fail sooner?
+; 411 Length Required : A cHTTP dialogue requires Content-Length for every non-zero length message so this sound like a permanent failure to me.
+[[User:Phoenix Linden|Phoenix Linden]] 22:28, 1 November 2007 (PDT)
+:Hah, we should have looked at [http://www.ilovejackdaniels.com/apache/http-status-codes-explained/ this page], since it already kinda does the work for us.
+:; 304 : I don't think it really matters.  If the client wants to add an If-Modified-Since header, why shouldn't the server accommodate it?  I don't think there's a compelling reason to make chttp behave differently than standard http with regards to this status code.
+:; 203 : Agreed.
+:; 406 : The case where this might be usefully Ambiguous is when a server is temporarily incapacitated or misconfigured and throws 406s until it gets fixed.  Not sure if that's plausible though.
+:; 411 : Agreed.
+:Thanks for looking these over!  I'm gonna wait for Sardonyx to weigh in before moving stuff around though.
+:[[User:Which Linden|Which Linden]] 23:21, 1 November 2007 (PDT)
+::; 304: Agree.
+::; 406: Maybe. Let's leave it ambiguous.
+::[[User:Phoenix Linden|Phoenix Linden]] 11:14, 4 November 2007 (PST)