PDFs are evil.

Probably the biggest annoyance in developing government web sites has been the overuse of the PDF format. But the problem extends beyond government sites: the web is full of PDFs. I’ve been taught over the years that PDF is not optimal for delivering content over the web, but I’ve recently discovered more information that backs this notion.

The expert’s opinion

On my walk to work this morning, I was listening to the first episode of the Web Design Podcast. In the podcast, Paul Boag recommended an article on A List Apart called Facts and Opinions About PDF Accessibility. In this article, Joe Clark helpfully lists situations where the use of PDF on web sites should be considered acceptable. If you start fishing around on sites that use PDF, you’ll find that the typical usage does not fit any of Clark’s scenarios.

Symptoms of the problem: overuse of PDFs

What are the business cases for having this widespread overuse of PDF? I can think of a few off the top of my head:

  • Managers tend to think that if a document is in PDF format, it is read-only and thus more “official.”
  • It saves time to put a link to a PDF on the web site. No need to draft up a separate HTML page or entry in the content management system.
  • Non-technical users can easily create PDF documents based on Word and Excel documents and hand them over to the technical users or the CMS for publication. This frees up time for technical users and gives the business users a better feeling of “control” over the appearance of the document.

The end result is a large mass of information that is not optimized for consumption in a web browser. Sure, there are time savings on the business side, but what gets compromised is the general user experience. The metaphor of a printed page is forced on an electronic medium that is not optimized for it.

The root of the problem: content management systems

I think overuse of PDF technology is a direct result of inadequate content management process and technology. In my February presentation for COMMUG, I quoted Victor Lombardi’s article about the problem of content management:

Content management systems suck.

The industry has a long way to go in this undertaking. I think content management systems still struggle to serve the needs of content contributors while also providing a reasonable IT infrastructure.

Products like Adobe’s Contribute and Dreamweaver provide great workflows for content contributors and designers, but my experience with those products is that they are still in their infancy with regard to meeting IT infrastructure needs efficiently. I think it is possible to use Contribute and Dreamweaver to meet data infrastructure needs, but it would require a large amount of manpower because there is no out-of-the-box solution. Adobe’s Contribute Publishing Server merely provides an API that allows integration between Contribute and custom applications, but you have to build that part yourself!

Most other large “enterprise” CMS products are backed by complex database systems and push much of that complexity onto content contributors through a relatively awkward workflow. We need end-user software that requires little training and makes content publishing a breeze.

There is a great need for a system that doesn’t make storing and managing the content on the back end a mess like Contribute and Dreamweaver do. But the need is also for this system’s interface to provide content contributors with a workflow that isn’t a nightmare to learn. Who will find that balance in their product? Who will make HTML content worth managing, thus eliminating PDF overload?