mod_perl offers the $r->set_etag() method if
we have use()d Apache::File.
However, we strongly recommend that you don't use
the set_etag() method! set_etag()
is meant to be used in conjunction with a static request
for a file on disk that has been stat()ed in the
course of the current request. It is inappropriate and dangerous to
use it for dynamic content.
By sending an entity tag we are promising the recipient that we will
not send the same ETag for the same resource again
unless the content is "equal" to
what we are sending now.
The pros and cons of using entity tags are discussed in section 13.3
of the HTTP specification. For mod_perl programmers, that discussion
can be summed up as follows.
There are strong and weak validators. Strong validators change
whenever a single bit changes in the response; i.e., when anything
changes, even if the meaning is unchanged. Weak validators change
only when the meaning of the response changes. Strong validators are
needed for caches to allow for sub-range requests. Weak validators
allow more efficient caching of equivalent objects. Algorithms such
as MD5 or SHA are good strong validators, but what is usually
required when we want to take advantage of caching is a good weak
validator.
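The difference can be illustrated with a short sketch. The strong validator below is an MD5 digest of the exact response bytes. The weak validator is purely an assumption for illustration: it digests a whitespace-normalized copy of the body, on the premise that whitespace does not change the meaning of this particular document. The function names and sample bodies are hypothetical.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Strong validator: a digest over the exact bytes of the response body.
# Any single-byte change produces a different value.
sub strong_validator {
    my ($body) = @_;
    return md5_hex($body);
}

# Weak validator (illustrative assumption): a digest over a normalized
# copy of the body in which runs of whitespace are collapsed, so purely
# cosmetic changes do not alter the value.
sub weak_validator {
    my ($body) = @_;
    (my $normalized = $body) =~ s/\s+/ /g;
    $normalized =~ s/^ | $//g;
    return md5_hex($normalized);
}

my $v1 = "<p>Hello, world</p>\n";
my $v2 = "<p>Hello,   world</p>\n\n";    # same meaning, different bytes

printf "strong differ: %s\n",
    strong_validator($v1) ne strong_validator($v2) ? "yes" : "no";
printf "weak equal:    %s\n",
    weak_validator($v1) eq weak_validator($v2) ? "yes" : "no";
```

What counts as "the same meaning" depends entirely on the application, so the normalization step is the part that would need to be replaced in real code.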
HTTP Range Requests
It is possible in web clients to interrupt the connection before the
data transfer has finished. As a result, the client may have partial
documents or images loaded into its memory. If the page is reentered
later, it is useful to be able to request the server to return just
the missing portion of the document, instead of retransferring the
entire document.
There are also a number of web applications that benefit from being
able to request the server to give a byte range of a document. As an
example, a PDF viewer would need to be able to access individual
pages by byte range—the table that defines those ranges is
located at the end of the PDF file.
In practice, most of the data on the Web is represented as a byte
stream and can be addressed with a byte range to retrieve a desired
portion of it.
For such an exchange to happen, the server needs to let the client
know that it can support byte ranges, which it does by sending the
Accept-Ranges header (for example, Accept-Ranges: bytes).
The server will send this header only for documents for which it will
be able to satisfy the byte-range request—e.g., for PDF
documents or images that are only partially cached and can be
partially reloaded if the user interrupts the page load.
The client requests a byte range using the Range header (for example,
Range: bytes=0-499 for the first 500 bytes of a document).
Because of the architecture of the byte-range request and response,
the client is not limited to attempting to use byte ranges only when
the Accept-Ranges header is present. If a server does not support the
Range header, it will simply ignore it and send
the entire document as a response.
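The fallback behavior described above can be sketched in plain Perl. This is a simplified illustration, not how Apache implements it: the function names resolve_range and respond are hypothetical, only a single byte range is handled, and any range that cannot be used simply results in the whole document being returned (a real server would normally answer an unsatisfiable range with a 416 status instead).

```perl
use strict;
use warnings;

# Resolve a single-range "Range" header value against a body of $length
# bytes. Returns (first, last) byte offsets, or an empty list if the
# header is absent, malformed, multi-range, or cannot be satisfied.
sub resolve_range {
    my ($header, $length) = @_;
    return unless defined $header && $header =~ /^bytes=(\d*)-(\d*)$/;
    my ($first, $last) = ($1, $2);
    return if $first eq '' && $last eq '';

    if ($first eq '') {
        # Suffix form "bytes=-N": the last N bytes of the document.
        return if $last == 0;
        $first = $length > $last ? $length - $last : 0;
        $last  = $length - 1;
    }
    else {
        # Open-ended or overlong ranges are clipped to the document end.
        $last = $length - 1 if $last eq '' || $last >= $length;
    }
    return if $first >= $length || $first > $last;
    return ($first, $last);
}

# Return (status, payload): 206 with the requested slice when the range
# is usable, otherwise 200 with the entire document.
sub respond {
    my ($body, $range_header) = @_;
    my ($first, $last) = resolve_range($range_header, length $body);
    return (200, $body) unless defined $first;
    return (206, substr($body, $first, $last - $first + 1));
}

my $doc = join '', 'a' .. 'z';
my ($status, $part) = respond($doc, 'bytes=0-4');
print "$status $part\n";    # 206 abcde
```

A request without a Range header, or with one this sketch does not understand, takes the same path as a server that ignores the header altogether.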
A Last-Modified time, when used as a validator in
a request, can be strong or weak, depending on a couple of rules
described in section 13.3.3 of the HTTP standard. This is mostly
relevant for range requests, as this quote from section 14.27
explains:
If the client has no entity tag for an entity, but does have a Last-Modified date, it
MAY use that date in an If-Range header.
But it is not limited to range requests. As section 13.3.1 states,
the value of the Last-Modified header can also be
used as a cache validator.
The fact that a Last-Modified date may be used as
a strong validator can be pretty disturbing if we are in fact
changing our output slightly without changing its semantics. To
prevent this kind of misunderstanding between us and the cache
servers in the response chain, we can send a weak validator in an
ETag header. This is possible because the specification states:
If a client wishes to perform a sub-range retrieval on a value for which it has only
a Last-Modified time and no opaque validator, it MAY do this only if the
Last-Modified time is strong in the sense described here.
In other words, by sending an ETag that is marked
as weak, we prevent the cache server from using the
Last-Modified header as a strong validator.
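To see why a weak tag cannot be used for sub-range retrieval, it helps to look at the two comparison functions the HTTP specification defines for validators. The sketch below is one way to express them in Perl; the helper names parse_etag, strong_match, and weak_match are hypothetical.

```perl
use strict;
use warnings;

# Split an entity tag into (is_weak, opaque_value).
# Returns an empty list if the tag is not well formed.
sub parse_etag {
    my ($tag) = @_;
    return unless defined $tag && $tag =~ m{^(W/)?"([^"]*)"$};
    return (defined $1 ? 1 : 0, $2);
}

# Strong comparison: the opaque values must be identical and
# neither tag may be marked as weak.
sub strong_match {
    my ($tag_x, $tag_y) = @_;
    my ($weak_x, $value_x) = parse_etag($tag_x);
    my ($weak_y, $value_y) = parse_etag($tag_y);
    return 0 unless defined $value_x && defined $value_y;
    return !$weak_x && !$weak_y && $value_x eq $value_y ? 1 : 0;
}

# Weak comparison: the opaque values must be identical, but either
# or both tags may be marked as weak.
sub weak_match {
    my ($tag_x, $tag_y) = @_;
    my ($weak_x, $value_x) = parse_etag($tag_x);
    my ($weak_y, $value_y) = parse_etag($tag_y);
    return 0 unless defined $value_x && defined $value_y;
    return $value_x eq $value_y ? 1 : 0;
}

print strong_match('W/"v1"', 'W/"v1"'), "\n";    # 0
print weak_match('W/"v1"', 'W/"v1"'),   "\n";    # 1
```

Range requests require the strong comparison, so a tag marked with W/ never matches for that purpose, while it still works for ordinary cache revalidation.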
An ETag value is marked as a weak validator by
prepending the string W/ to the quoted string;
otherwise, it is strong. In Perl this would mean something like this:
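A minimal sketch of the two forms, runnable outside the server. The $revision value and the helper names strong_etag and weak_etag are illustrative assumptions, and the revision string is assumed not to contain double quotes.

```perl
use strict;
use warnings;

# $revision is a hypothetical opaque version string chosen by the
# application; it must not itself contain a double quote.
sub strong_etag {
    my ($revision) = @_;
    return qq{"$revision"};
}

sub weak_etag {
    my ($revision) = @_;
    return qq{W/"$revision"};
}

my $revision = 'v42';
print "ETag: ", strong_etag($revision), "\n";    # ETag: "v42"
print "ETag: ", weak_etag($revision),   "\n";    # ETag: W/"v42"

# Inside a mod_perl 1.x handler, the value would be attached to the
# response with something like:
#   $r->header_out('ETag', weak_etag($revision));
```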
Consider carefully which string is chosen to act as a validator. We
are on our own with this decision:
... only the service author knows the semantics of a resource well enough to select
an appropriate cache validation mechanism, and the specification of any validator
comparison function more complex than byte-equality would open up a can of worms.
Thus, comparisons of any other headers (except Last-Modified, for compatibility with
HTTP/1.0) are never used for purposes of validating a cache entry.
If we are composing a message from multiple components, it may be
necessary to combine some kind of version information for all these
components into a single string.
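One way to do this, sketched under the assumption that every component can report some version string of its own, is to join the versions in a fixed order and digest the result. The component names and version values below are hypothetical, as is the combined_validator helper.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Build one validator string from the versions of all components.
# Keys are sorted so that the result does not depend on hash order.
sub combined_validator {
    my (%versions) = @_;
    my $joined = join ';', map { "$_=$versions{$_}" } sort keys %versions;
    return md5_hex($joined);
}

# Hypothetical components of a composed page.
my %page = (
    template => '1.7',
    article  => '2004-03-02T10:15:00',
    sidebar  => 'r311',
);

my $validator = combined_validator(%page);
print qq{ETag: W/"$validator"\n};
```

The tag changes whenever any one component reports a new version and stays the same otherwise. It is sent as a weak tag here because matching component versions do not guarantee a byte-identical response.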
If we are producing relatively large documents, or content that does
not change frequently, then a strong entity tag will probably be
preferred, since this will give caches a chance to transfer the
document in chunks.