Computing message digests in the browser

Use cases for message digests

Hashes, or message digests, can be quite useful even outside their primary field of use, which is cryptography. They can identify any amount of data by a fairly short, fixed-length number, which is usually displayed in hexadecimal. One use case is to quickly check whether a copy of a file matches the original, which is why many large downloads such as Linux ISOs are accompanied by their message digests. All modern operating systems can calculate message digests for at least some algorithms, but not all of them make this easily accessible to the end user.

Computing it in the browser

The idea for the File-Hasher page was to make this functionality accessible on any machine with a modern browser. Partly this is because it can actually be useful, but also because it provided a chance to try some of the capabilities available in web browsers these days. A simple implementation would upload the file to a server, have the server compute the message digest and send it back to the browser. But this would limit the size of files which can be checked and make things fairly slow (in most cases). Of course, modern JavaScript already has an API to compute message digests, SubtleCrypto.digest(), which is really nice and would cover the main use cases here.

But it's not quite there yet. It is limited to the SHA family of algorithms (SHA-1, SHA-256, SHA-384 and SHA-512). This would have been fine, but it's nice to be able to include others. A more severe limitation is the API itself, which requires all data to be passed in a single call. This is fine for small to medium-sized files, but limits file sizes to whatever the browser is capable of (and willing to) handle.
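To make the one-call limitation concrete, here is a minimal sketch of hashing a file with SubtleCrypto.digest(). The function name digestWholeFile is made up for this illustration and is not taken from the File-Hasher code. Note that the whole file has to sit in memory as a single buffer before the digest can be computed.

```javascript
// Sketch only: digestWholeFile is a hypothetical helper, not File-Hasher code.
// `file` is a File or Blob, e.g. taken from an <input type="file"> element.
async function digestWholeFile(file, algorithm = "SHA-256") {
  // The entire file is loaded into one ArrayBuffer here, because
  // SubtleCrypto.digest() offers no way to feed in data piece by piece.
  const data = await file.arrayBuffer();
  const digest = await crypto.subtle.digest(algorithm, data);

  // Convert the raw digest bytes to the usual lowercase hexadecimal form.
  return Array.from(new Uint8Array(digest), (byte) =>
    byte.toString(16).padStart(2, "0")
  ).join("");
}
```

Called as `await digestWholeFile(file, "SHA-512")`, this works nicely for small files, but memory use grows with the size of the file.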

Using WebAssembly

The JavaScript File API allows reading a file in smaller chunks. By passing each chunk to some WebAssembly-based code which performs the calculation one chunk after the other, it is possible to have this work on files of pretty much any size, using whatever message digest one can find an implementation for. So this is what the File-Hasher does. There is some very simple Rust-based code using external libraries to compute the message digest. This code is translated to WebAssembly and loaded by the JavaScript code on the File-Hasher page. Now whenever the user selects a file, the contents of that file are read and continuously passed to the WebAssembly part. Once the whole file has been read, the message digests are computed and displayed (again in JavaScript).
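The chunked reading can be sketched roughly as follows. This is an illustration under assumptions, not the actual File-Hasher code: the Fnv1a32 class is a made-up stand-in for the WebAssembly hasher (it computes a simple non-cryptographic FNV-1a checksum so that the example is self-contained), and the update/finish method names as well as the 4 MiB chunk size are arbitrary choices.

```javascript
// Hypothetical stand-in for the WebAssembly module: a 32-bit FNV-1a checksum.
// It is NOT a cryptographic digest; it only mimics the update/finish shape
// that an incremental hasher would typically expose.
class Fnv1a32 {
  constructor() {
    this.state = 0x811c9dc5; // FNV offset basis
  }
  update(bytes) {
    for (const byte of bytes) {
      // XOR in the byte, then multiply by the FNV prime (mod 2^32).
      this.state = Math.imul(this.state ^ byte, 0x01000193) >>> 0;
    }
  }
  finish() {
    return this.state.toString(16).padStart(8, "0");
  }
}

// Read `file` (a File or Blob) chunk by chunk and feed each chunk to `hasher`.
// Only one chunk is held in memory at any time.
async function hashInChunks(file, hasher, chunkSize = 4 * 1024 * 1024) {
  for (let offset = 0; offset < file.size; offset += chunkSize) {
    const chunk = file.slice(offset, offset + chunkSize);
    hasher.update(new Uint8Array(await chunk.arrayBuffer()));
  }
  return hasher.finish();
}
```

With a real digest implementation behind the same kind of interface, `await hashInChunks(file, hasher)` returns the same result no matter which chunk size is used, since the hasher carries its state from one chunk to the next.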