An interesting web application infrastructure issue

Many web applications require cache control for pages, especially if
they involve user logons or time-dependent data.

Usually this is achieved with HTTP headers – something like (in JSP):

response.setHeader("Cache-Control", "no-cache");
response.setHeader("Expires", "Fri, 30 May 1980 14:00:41 GMT");
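Rather than hard-coding the date string, the Expires value can be generated in the RFC 1123 HTTP-date format the header requires, which avoids typos and weekday mismatches. A minimal standalone sketch (the class and method names are mine, and no servlet container is needed to run it):

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class ExpiresHeader {
    // Format a date as an RFC 1123 HTTP-date, the format the Expires
    // header requires (e.g. "Fri, 30 May 1980 14:00:41 GMT").
    static String httpDate(ZonedDateTime when) {
        return DateTimeFormatter.RFC_1123_DATE_TIME
                .format(when.withZoneSameInstant(ZoneOffset.UTC));
    }

    public static void main(String[] args) {
        // Any date safely in the past marks the response as already stale.
        System.out.println(httpDate(
                ZonedDateTime.of(1980, 5, 30, 14, 0, 41, 0, ZoneOffset.UTC)));
    }
}
```

In a JSP you would pass the resulting string straight to response.setHeader("Expires", ...).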

An alternative, which usually works well, is to require your site to run
under HTTPS. In theory, this seems ideal, since it provides security as well
as cache control.

However, beware of the impact of things like reverse proxies. Many companies
are installing reverse proxies in front of their web hosting machines to
filter requests, providing some protection against SQL injection and XSS
attacks on their websites. This is a really good idea, but it can have some
unexpected impacts.

One I didn't expect was the impact on caching. Because the proxy needs to
inspect the request, it decrypts it, then forwards it to the server as a
plain HTTP request. Many vendors list this as a feature, because it offloads
some processing requirements to the proxy box instead of the webserver. The
catch comes if you don't have explicit cache control in your pages AND the
reverse proxy is a caching reverse proxy. In that case the proxy may return
cached content to the user, which is NOT WHAT YOU WANT!
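One quick way to spot this happening is to look at the response headers your browser receives: HTTP 1.1 caches add an Age header when serving from cache, and proxies usually append themselves to Via. A small illustrative check (the header map and the heuristic are my own sketch, not a standard API):

```java
import java.util.Map;

public class CacheCheck {
    // Heuristic only: an Age header greater than zero means some shared
    // cache served the response from storage rather than fetching it fresh.
    static boolean looksCached(Map<String, String> headers) {
        String age = headers.get("Age");
        if (age == null) return false;
        try {
            return Integer.parseInt(age.trim()) > 0;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(looksCached(Map.of("Age", "120", "Via", "1.1 proxy"))); // likely a cache hit
        System.out.println(looksCached(Map.of("Via", "1.1 proxy")));               // fetched fresh
    }
}
```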

Another complicating factor is that some reverse proxies forward the original HTTP
1.1 requests to the server as HTTP 1.0, and seem to ignore HTTP 1.1 headers
that are returned. This can bite you if you only use the “Cache-Control”
(HTTP 1.1 only) header.

Always provide explicit cache control AND expiry headers, and never rely on
HTTPS to control caching for you.
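Putting that together: one way to cover both HTTP 1.0 and HTTP 1.1 caches (browsers and proxies alike) is to set Cache-Control, Pragma, and Expires on every sensitive response. A sketch of the full header set, using a plain map in place of HttpServletResponse so it runs without a servlet container:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NoCacheHeaders {
    // The full anti-caching header set. In a JSP you would call
    // response.setHeader(...) once for each entry.
    static Map<String, String> noCacheHeaders() {
        Map<String, String> h = new LinkedHashMap<>();
        h.put("Cache-Control", "no-cache, no-store, must-revalidate"); // HTTP 1.1
        h.put("Pragma", "no-cache");                                   // HTTP 1.0
        h.put("Expires", "Fri, 30 May 1980 14:00:41 GMT");             // a date in the past
        return h;
    }

    public static void main(String[] args) {
        noCacheHeaders().forEach((k, v) -> System.out.println(k + ": " + v));
    }
}
```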
