HTTP Content Signature Header
-----------------------------

Version: 2008-06-29
Status: Brainstorming.

Overview:

In a world where some ISPs are starting to fiddle with the contents of data
downloaded by their customers[1], it'd be useful to be able to detect any
manipulation of data and to alert the user.

Currently there is a Content-MD5 header to verify that data is delivered
unchanged, but it's easy for a malicious party with access to the network to
also alter the MD5 sum when altering content data. To prevent this it's
necessary to be able to verify the origin of the data.

The only current way to do this is to use HTTPS for all transfers, but that
requires more resources on the server side, especially for sites with lots of
traffic.

Using OpenPGP[2] signatures it's possible to use plain old HTTP and still be
able to detect altered data. There is some initial network overhead since
public keys will need to be downloaded, but some keys could be distributed
with the browser (for example keys for Google, MSN and a few more high-traffic
sites), and once a key has been downloaded it stays in the local cache.

Using this approach, it's possible for caches along the way to store data, and
it's possible to use pools of web servers with different IP addresses. If the
signatures are pre-computed and stored alongside the content on the origin
servers, the servers won't even need access to the private key.

Basic structure of header:

    Content-Signature: [base64 encoded signature];[base64 encoded signature checksum]

Example header:

    Content-Signature: iD8DBQFIYQCSi0P7OS4VvkwRAm7nAKC1Ra4RmhtgPFEIckxu0uACoVWVIwCg0u2B5u2gS2tSO7LXagplAF+AwI0=;=FfiF

Compare to a normal PGP signature:

    -----BEGIN PGP SIGNED MESSAGE-----
    Hash: SHA1

    <html><body></body></html>
    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.4.7 (Darwin)

    iD8DBQFIYQCSi0P7OS4VvkwRAm7nAKC1Ra4RmhtgPFEIckxu0uACoVWVIwCg0u2B
    5u2gS2tSO7LXagplAF+AwI0=
    =FfiF
    -----END PGP SIGNATURE-----

Verifying that a key is allowed to sign content:

In the signature the signing key is identified using a key id. Before trusting
the signature, the browser needs to verify that the key is indeed associated
with the web site in question. This is fairly tricky, since an unencrypted
network connection cannot be used to verify it (remember, we're assuming an
evil ISP).

To get around this problem, we add an additional HTTP header, Signature-Keys,
that points to a list of trusted keys that can be downloaded over an HTTPS
connection. By doing this, we can be sure that we get a valid list (assuming
we trust HTTPS connections). E.g.:

    Signature-Keys: https://keys.someserver.com/trusted_keys.txt

The list of trusted keys has to be located on the same host as the content to
be verified, or on one of its super domains. This means that google.com could
be used for any *.google.com domain, but not for microsoft.com.

To allow users without access to HTTPS-enabled servers to sign content, it
might be necessary to allow certain exceptions to this rule so that a third
party can maintain key lists for non-HTTPS domains. Only a small number of
such exceptions can be allowed, and the exceptions must be clearly defined.

The structure of the key list is simple: Each line consists of a trust level,
a key id, and a domain pattern. Allowed trust levels are 'trusted' and
'untrusted'.
Listing a key as untrusted can signify that it was once trusted but no longer
is, or it can be used to assign a set of keys to different parts of a domain.

E.g. https://someserver.com/trusted_keys.txt:

    untrusted AABBCC11 *.someserver.com      # old key that got leaked
    trusted   BBCC33DD www.someserver.com    # web key
    trusted   CC9988EE *.someserver.com      # generic key
    untrusted CC9988EE images.someserver.com # generic key isn't trusted to sign images

Text after a # sign is ignored as a comment. The domain pattern is very basic:
it's either a specific domain, or a domain prefixed with "*.", meaning that
it's valid for that domain and all sub-domains.

Once a list of trusted keys has been downloaded for a domain, it can be cached
for as long as allowed by standard HTTP cache rules. Future downloads from the
same domain can use the cached version of the list, assuming that the new
content uses one of the keys in the list.

Alternatively, the Signature-Keys header can be sent in HTML content using a
meta tag. E.g. contents of URL http://www.someserver.com/index.html:

    <html>
      <head>
        <meta http-equiv="Signature-Keys" content="https://someserver.com/trusted_keys.txt" />
      </head>
      ...
    </html>

Note that the Content-Signature header cannot be sent as a meta tag, since
that would require the signature to sign itself. It could be argued that a
signature could be calculated with the content attribute of the meta tag set
to "", but that would be somewhat complicated in practice. It might be worth
the effort though, to allow signing HTML content stored on servers that lack
Content-Signature support.

Implementation:

The client part of the system can be implemented as a browser plugin in
browsers that support plugins, or of course natively in the browser. The main
thing to think about here is how to alert the user if there is a signature
mismatch.

On the server side the best course of action depends a bit on the size and
nature of the content to be signed.
For static files it's trivial to calculate and store the signatures as files
are retrieved. For dynamic content it's not as simple, but for small to medium
size content it's possible to store the output of a script in RAM and generate
the signature before data is sent to the client. For larger content it will be
up to the script to generate and add the signature.

References:

[1] http://www.theregister.co.uk/2008/06/23/topolski_takes_on_nebuad/
[2] http://www.openpgp.org/

Document maintainer: mikael@eiman.tv
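
Key list check, illustrated:

The key list rules described under "Verifying that a key is allowed to sign
content" can be sketched in Python roughly as follows. This is only an
illustration under stated assumptions: the names KeyEntry, parse_key_list,
pattern_matches and is_key_trusted are made up for this example, and the rule
that a matching 'untrusted' entry overrides any matching 'trusted' entry is an
assumption, since the text above doesn't say how conflicting entries are
resolved.

```python
from typing import List, NamedTuple


class KeyEntry(NamedTuple):
    trusted: bool   # True for 'trusted', False for 'untrusted'
    key_id: str     # key id, normalised to upper case
    pattern: str    # "example.com" or "*.example.com", lower case


def parse_key_list(text: str) -> List[KeyEntry]:
    """Parse key list text, skipping comments and blank lines."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # text after '#' is a comment
        if not line:
            continue
        fields = line.split()
        if len(fields) != 3 or fields[0] not in ("trusted", "untrusted"):
            raise ValueError("malformed key list line: %r" % raw)
        level, key_id, pattern = fields
        entries.append(
            KeyEntry(level == "trusted", key_id.upper(), pattern.lower()))
    return entries


def pattern_matches(pattern: str, host: str) -> bool:
    """A "*." prefix covers the domain itself and all of its sub-domains."""
    host = host.lower().rstrip(".")
    if pattern.startswith("*."):
        base = pattern[2:]
        return host == base or host.endswith("." + base)
    return host == pattern


def is_key_trusted(entries: List[KeyEntry], key_id: str, host: str) -> bool:
    """Return True if key_id may sign content served from host."""
    matches = [e for e in entries
               if e.key_id == key_id.upper()
               and pattern_matches(e.pattern, host)]
    # Assumption: one matching 'untrusted' entry overrides any 'trusted' ones.
    return bool(matches) and all(e.trusted for e in matches)


# The example key list from the text above.
KEY_LIST = """\
untrusted AABBCC11 *.someserver.com # old key that got leaked
trusted BBCC33DD www.someserver.com # web key
trusted CC9988EE *.someserver.com # generic key
untrusted CC9988EE images.someserver.com # generic key isn't trusted to sign images
"""

entries = parse_key_list(KEY_LIST)
print(is_key_trusted(entries, "CC9988EE", "www.someserver.com"))     # True
print(is_key_trusted(entries, "CC9988EE", "images.someserver.com"))  # False
```

With the example list, CC9988EE is accepted for www.someserver.com but
rejected for images.someserver.com. A different conflict rule (for example,
letting the most specific pattern win) could be swapped in by changing only
is_key_trusted.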