All about DjVu files
DjVu (pronounced "déjà vu") is an image compression format developed specifically for scanned documents. Unlike PDF, which is optimised for text and vector graphics, DjVu stores compressed pixel images. This makes it well suited to scans of books, magazines and historical documents. Its biggest advantage is a much smaller file size than a PDF of the same scan. DjVu was developed at AT&T Labs between 1996 and 2001.
How does compression work with DjVu files?
DjVu separates each page into layers: a foreground (text and lines), a background (paper texture and images) and a mask. Each layer is compressed with an algorithm optimised for its content:
The text is processed using the JB2 algorithm, which recognises recurring patterns. For example, if the letter "a" appears repeatedly in the same font and size, its shape is stored only once, and every further occurrence is stored as a reference to it. This saves a considerable amount of storage space.
The background layer uses IW44, a wavelet-based algorithm similar to JPEG 2000. As a result, DjVu files can be 5 to 10 times smaller than PDF files of comparable quality.
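The pattern reuse that JB2 performs on text can be illustrated with a small sketch. This is not the real JB2 codec, which also matches near-identical shapes and entropy-codes the result. It is a toy model that assumes glyphs are already cut out as tiny bitmaps and only deduplicates exact matches. The names `encode_page`, `decode_page`, `LETTER_A` and `LETTER_B` are made up for this illustration.

```python
from typing import Dict, List, Tuple

Bitmap = Tuple[str, ...]          # one glyph as rows of '#' (ink) and '.' (paper)
Glyph = Tuple[Bitmap, int, int]   # (bitmap, x, y) as found on the page
Placement = Tuple[int, int, int]  # (dictionary index, x, y)


def encode_page(glyphs: List[Glyph]) -> Tuple[List[Bitmap], List[Placement]]:
    """Store each distinct bitmap once; record every occurrence as a reference."""
    dictionary: List[Bitmap] = []
    index_of: Dict[Bitmap, int] = {}
    placements: List[Placement] = []
    for bitmap, x, y in glyphs:
        if bitmap not in index_of:
            # First time this shape appears: add it to the shared dictionary.
            index_of[bitmap] = len(dictionary)
            dictionary.append(bitmap)
        # Every occurrence, first or repeated, is just an index plus a position.
        placements.append((index_of[bitmap], x, y))
    return dictionary, placements


def decode_page(dictionary: List[Bitmap], placements: List[Placement]) -> List[Glyph]:
    """Rebuild the glyph list by looking each reference up in the dictionary."""
    return [(dictionary[index], x, y) for index, x, y in placements]


# Hypothetical 3x4 bitmaps standing in for scanned letters.
LETTER_A: Bitmap = (".#.", "#.#", "###", "#.#")
LETTER_B: Bitmap = ("##.", "###", "#.#", "##.")

page: List[Glyph] = [
    (LETTER_A, 0, 0),
    (LETTER_B, 4, 0),
    (LETTER_A, 8, 0),
    (LETTER_A, 12, 0),
]
dictionary, placements = encode_page(page)
```

Here the page contains four glyphs but only two distinct shapes, so the dictionary holds two bitmaps and the three occurrences of "a" all point to the same entry.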
History and distribution of DjVu
In the early 2000s, DjVu was considered a serious competitor to PDF. The Million Book Project, one of the largest digitisation projects worldwide, used DjVu as one of its output formats from 2002 onwards. University libraries, Wikisource and some scientific archives also relied on DjVu. Its heyday ended around 2015, when browsers dropped support for the plugins and Java applets that DjVu viewers relied on. In 2016, the Internet Archive announced that it would no longer create new DjVu files, as PDF had become the standard.
Although few new DjVu files are created today, millions of them are still available on the internet. To open these files directly and in high quality on any device, it makes sense to convert them to PDF. The only disadvantage is a slightly larger file size after conversion, which is usually acceptable today.
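One possible way to do such a conversion locally is the `ddjvu` command-line tool from the open-source DjVuLibre package. The following is a minimal sketch that assumes `ddjvu` is installed and on the PATH. The helper names `build_ddjvu_command` and `convert_djvu_to_pdf` are invented for this example, and other converters work just as well.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_ddjvu_command(source: Path, target: Path) -> List[str]:
    """Build the ddjvu call that renders a DjVu document to a PDF file."""
    return ["ddjvu", "-format=pdf", str(source), str(target)]


def convert_djvu_to_pdf(source: Path) -> Path:
    """Convert one DjVu file to a PDF stored next to it (assumes ddjvu is installed)."""
    if shutil.which("ddjvu") is None:
        raise RuntimeError("ddjvu not found; install DjVuLibre first")
    target = source.with_suffix(".pdf")
    # check=True raises CalledProcessError if ddjvu exits with an error.
    subprocess.run(build_ddjvu_command(source, target), check=True)
    return target
```

Calling `convert_djvu_to_pdf(Path("scan.djvu"))` would then write `scan.pdf` into the same folder. Separating the command builder from the actual call keeps the command easy to inspect without running the tool.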
Sources
Archive.org: Discussion about the end of DjVu creation
Archive.org: Review of 20 years of the Million Book Project
Eldakar, Y., El Gammal, K., Adly, N. et al.: The Million Book Project at Bibliotheca Alexandrina. Journal of Zhejiang University-SCIENCE A 6(11), 1327–1340 (2005). https://doi.org/10.1631/jzus.2005.A1327