Thumbnail Saving

The thumbnail filename is determined by a hash function. This proposal uses MD5 as the hash mechanism in the following way.

  1. Obtain the absolute canonical URI of the original file, as defined in RFC 2396. In particular, local 'file:' resources are written with three '/' characters (see the example below).

  2. Calculate the MD5 hash of this URI, not of the file it points to. This yields a 128-bit hash, which can be represented as a hexadecimal number in a string 32 characters long.

  3. To get the final filename for the thumbnail, append '.png' to the hash string. Depending on the dimensions of the thumbnail, you must store the result in either $XDG_CACHE_HOME/thumbnails/normal or $XDG_CACHE_HOME/thumbnails/large.

An example will illustrate this:

Example 1. Saving a thumbnail

Suppose we have a file ~/photos/me.png. We want to create a thumbnail of 128x128 pixels for it, which means it will be stored in the $XDG_CACHE_HOME/thumbnails/normal directory. The absolute canonical URI for the file in this example is file:///home/jens/photos/me.png.

The MD5 hash of the URI as a hex string is c6ee772d9e49320e97ec29a7eb5b1697. Following the steps above, this results in the following final thumbnail path:

/home/jens/.cache/thumbnails/normal/c6ee772d9e49320e97ec29a7eb5b1697.png
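
The three steps above can be sketched in a few lines of Python. This is only an illustration, not a reference implementation: the function name thumbnail_path and the size_dir parameter are hypothetical, and the fallback to ~/.cache when $XDG_CACHE_HOME is unset is an assumption of this sketch. The URI is taken as given; producing the canonical URI is left to the caller.

```python
import hashlib
import os
from pathlib import Path


def thumbnail_path(uri: str, size_dir: str = "normal") -> Path:
    """Return the cache path for the thumbnail of `uri` (hypothetical helper)."""
    # Step 2: hash the URI string itself, not the contents of the file it names.
    digest = hashlib.md5(uri.encode("utf-8")).hexdigest()  # 32 hex characters
    # Assumption: fall back to ~/.cache when XDG_CACHE_HOME is unset or empty.
    cache_home = os.environ.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    # Step 3: append '.png' and pick the 'normal' or 'large' directory.
    return Path(cache_home) / "thumbnails" / size_dir / (digest + ".png")
```

For the file in Example 1, the call would be thumbnail_path("file:///home/jens/photos/me.png") with $XDG_CACHE_HOME set to /home/jens/.cache.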

Permissions

A few words regarding permissions: all the directories, including the $XDG_CACHE_HOME/thumbnails directory, must have their permissions set to 700 (only the owner has read, write and execute permissions; see "man chmod" for details). Similarly, all the files in the thumbnail directories should have their permissions set to 600. This ensures that if a user creates a thumbnail for a file that only they are permitted to read, no other user can view its contents indirectly through the thumbnail.

Programs should first check that the original image file is readable. If it is not, the program should not attempt to read a thumbnail from the cache, and it should not save any information in the cache (including "failed" thumbnails). Otherwise, thumbnailing will be prevented even if the permissions are changed to permit reading.
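
A minimal sketch of these permission rules in Python follows, assuming a POSIX system. The helper names (ensure_thumbnail_dirs, may_use_cache, restrict_thumbnail) are hypothetical and chosen only for this illustration.

```python
import os
from pathlib import Path


def ensure_thumbnail_dirs(thumbnails_root: Path, size_dir: str = "normal") -> Path:
    """Create the thumbnail directories and restrict them to the owner (700)."""
    target = thumbnails_root / size_dir
    target.mkdir(parents=True, exist_ok=True)
    # The mode passed to mkdir is filtered by the umask, so set 700 explicitly
    # on both the thumbnails directory and the size-specific subdirectory.
    for directory in (thumbnails_root, target):
        os.chmod(directory, 0o700)
    return target


def may_use_cache(original: Path) -> bool:
    """Read or write cache entries only when the original file is readable."""
    return os.access(original, os.R_OK)


def restrict_thumbnail(thumbnail: Path) -> None:
    """Give a stored thumbnail owner-only read and write permissions (600)."""
    os.chmod(thumbnail, 0o600)
```

A program would call may_use_cache first and skip both cache lookups and cache writes (including "failed" entries) when it returns False.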

Concurrent Thumbnail Creation

An important goal of this paper is to enable programs to share their thumbnails. This includes concurrent access to the cache by different programs. Problems arise if two programs try to create a thumbnail for the same file at the same time. For this reason, the following procedure is suggested:

  1. Check if the thumbnail already exists and if it's valid.

  2. If the above conditions are not fulfilled, create the thumbnail and write it to disk under a temporary filename.

  3. Rename the temporary file to the thumbnail filename. Since this is an atomic operation, other programs see either the completely written new thumbnail or none of it.

This way the worst case is that a thumbnail will be written twice. However, the thumbnail is in a sane state at any time.

Note

The temporary file should be placed in the same directory as the final thumbnail, because then you can be sure that both lie on the same filesystem. This guarantees fast renaming of the temporary file. Combining the program name, the process id and, e.g., the first characters of the hash string should give a fairly unique temporary name.
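
Steps 2 and 3 of the procedure, together with the naming advice in this note, might look as follows in Python. This is a sketch under stated assumptions: the function name save_thumbnail_atomically, the default program name and the ".tmp" suffix are hypothetical, and the validity check of step 1 is left out because it is described elsewhere.

```python
import os
from pathlib import Path


def save_thumbnail_atomically(final_path: Path, png_bytes: bytes,
                              program: str = "example-thumbnailer") -> None:
    """Write `png_bytes` to a temporary file, then rename it into place."""
    # Temporary name built from the program name, the process id and the
    # first characters of the hash (the stem of the final filename).
    tmp_name = f".{program}-{os.getpid()}-{final_path.stem[:8]}.tmp"
    tmp_path = final_path.parent / tmp_name  # same directory, same filesystem
    # O_EXCL makes creation fail if the temporary name is already taken;
    # mode 0o600 keeps the file private to its owner.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            tmp_file.write(png_bytes)
        os.replace(tmp_path, final_path)  # atomic rename on POSIX filesystems
    except BaseException:
        # Do not leave a stale temporary file behind on failure.
        if tmp_path.exists():
            os.unlink(tmp_path)
        raise
```

If two programs run this at the same time for the same file, each writes its own temporary file and the later rename wins, which matches the worst case described above: the thumbnail is written twice but is never seen half-written.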

Advantages Of This Approach

Previous versions of this standard used a very different mechanism for storing thumbnails. The current one has some very important advantages:

  1. It works for all kinds of file locations, since it is based only on the textual URI representation of a file. This way, files located on the local filesystem or on a Samba, HTTP, FTP or WebDAV server can be treated equally.

  2. It results in a flat directory hierarchy, which ensures fast access. Since the hash is always 32 characters long, the thumbnail filename is exactly 36 characters long for every possible file (including the '.png' suffix).

  3. Because an MD5 hash is used, clashes between two different thumbnails are unlikely. They are theoretically possible, but the probability is very low and can be ignored in this context. The worst case would be that a thumbnail overwrites another valid one. If both originals also have exactly the same modification time, it is theoretically possible that a wrong thumbnail will be displayed for a file (see Detect Modifications).

  4. It's very easy to implement.

Note

Many library implementations of the MD5 hash algorithm exist. If you don't want to add yet another library dependency to support thumbnailing in your program, you can, e.g., use the RFC 1321 implementation by L. Peter Deutsch. It adds only 1.5kb of source code in two files to your project and can be used with few restrictions.