This is the fourth in a series of articles about accessibility and office documents. In this article we’ll look further at Adobe Acrobat Portable Document Format (PDF) files and at methods for ensuring they are accessible to people with disabilities. In future articles we’ll discuss web site development, web applications and accessibility.
As we noted in the last article, the folks at Adobe have come a long way in making new PDF files accessible. However, one of the more vexing issues is what to do about older PDFs that were made several years ago, particularly those that were made by scanning. In the first part of this article we will address ways to ensure legacy PDFs meet accessibility standards, and in the second part we will discuss a fairly new type and use of PDFs: PDF Forms.
At some point in the evolution of Adobe Acrobat, the process for making PDFs became more complex and the Adobe Reader became more robust. But many of the earliest PDF files are simply images of the original document. If this is the case, you may simply have to re-type the original document, or scan it using an Optical Character Recognition (OCR) program and then edit the result. There are several OCR programs on the market, including OmniPage (www.omnipage.com) and ReadIris (www.irislink.com), both of which claim to be able to "read" and decipher any PDF. I will give you fair warning that even with direct conversions from these programs - that is, importing the PDF file as opposed to using a scanner - the output will still be only about 90-95% correct and will require some editing. This percentage goes down when the original document was created with a desktop publishing program like PageMaker and sports lots of columns, images and different fonts.
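To get a feel for what "90-95% correct" means in editing time, here is a minimal Python sketch of the arithmetic. It assumes the accuracy figure is measured per word, which the OCR vendors do not necessarily state, and the document size (10 pages at roughly 500 words per page) and the function name `expected_errors` are made up for illustration.

```python
def expected_errors(total_words: int, accuracy: float) -> int:
    """Estimate how many words need correction at a given OCR accuracy.

    Assumption: accuracy is a per-word fraction between 0 and 1.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return round(total_words * (1.0 - accuracy))


# Hypothetical document: 10 pages at roughly 500 words per page.
words = 10 * 500
best_case = expected_errors(words, 0.95)
worst_case = expected_errors(words, 0.90)
print(f"Expect roughly {best_case} to {worst_case} words to need correction.")
```

Under these assumptions, even the best case leaves a few hundred corrections for a short document, which is why it pays to budget editing time before starting a conversion.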
You may also be able to use Adobe Acrobat Professional to scan or decipher old PDFs. Acrobat Professional lets you highlight text in a PDF file and copy and paste it into another document (see directions below). If Acrobat senses that the content is in tabular format, it will even offer you the option of copying the content as a table or opening a spreadsheet and depositing the content there. However, in my experience, exporting and importing tabular data using this method produced considerable errors. Extracting plain text using the Copy method (including using the plain Copy method with the tabled data) generally gave better results.
Regardless of the method used to extract content from an old PDF file, you should expect a fair amount of editing to be necessary after the extraction is made.
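Some of this clean-up can be automated before the hand editing begins. The following is a minimal Python sketch, not a feature of Acrobat or of any OCR program, that tidies the most common artifacts in pasted text: words hyphenated across line breaks, hard line breaks inside paragraphs, and runs of extra spaces. The function name `clean_extracted_text` is hypothetical, and the sketch assumes that blank lines mark paragraph breaks in the pasted text.

```python
import re


def clean_extracted_text(raw: str) -> str:
    """Tidy common artifacts in text copied out of a PDF."""
    # Normalize Windows and old Mac line endings to "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words hyphenated across a line break: "docu-\nment" -> "document".
    # Caution: this also merges real compounds split at a line end,
    # so "well-\nknown" becomes "wellknown" and needs a manual fix.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Treat blank lines as paragraph breaks; unwrap everything else.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = []
    for paragraph in paragraphs:
        # Collapse line breaks and repeated spaces into single spaces.
        paragraph = re.sub(r"\s+", " ", paragraph).strip()
        if paragraph:
            cleaned.append(paragraph)
    return "\n\n".join(cleaned)


sample = "Acces-\nsible docu-\r\nments are   easy\nto read.\n\n\nSecond   paragraph."
print(clean_extracted_text(sample))
```

A script like this only handles layout debris. Misread characters, lost headings and scrambled reading order still have to be corrected by a person comparing the result against the original.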
Here are the step-by-step directions for copying text from a legacy PDF using Adobe Acrobat Professional:
Note that in some cases the copied and pasted content will have been stripped of all presentational elements (bolds, italics and fonts) and formatting. Be prepared to edit your pasted content.
By the way, if you are in a bit of a hurry, you can copy the whole document by clicking on Edit>Copy File to Clipboard and then pasting the content into your destination document. Be forewarned that this method will result in a document that will also require a fair amount of editing.
It should be noted that these methods cannot be used if the original PDF document has been "secured" by the original author. In this case, the only method for copying the content into a new document is to scan a printed copy of the document and use an OCR program to extract the text.
PDF Forms provide an efficient means of collecting information and data. Creating PDF forms requires another application called Adobe LiveCycle Designer (LCD), which is bundled with Adobe Acrobat Professional 8. LiveCycle Designer provides the option of building a form "from scratch" or using one of the many templates that come with the program.
Building a new form from the LCD template collection is fairly easy to accomplish. The application provides a wide variety of commonly used office forms from purchase orders to help desk requests.
When you create your first form, LCD asks you to answer several questions that identify your company or organization, and it even allows you to upload a company logo. The application "remembers" this information so that the next time you create a form using LCD, it will already be included.
Customizing the template is also fairly easy, as "objects" can be moved and dropped anywhere on the screen. Objects include everything from input boxes to radio buttons.
When you first load the application and choose the option of creating a new document from a template, one of the choices will be the Form Return option, which adds a "Print Form" button, a "Submit by E-mail" button, or both to the form. If included, the Print Form button appears as a clickable button at the top of the PDF form. The user can enter information directly into the form in Acrobat Reader and print out a copy of the completed form by simply pressing this button (Note: the image of the buttons will appear on the printed output).
The Submit by E-mail button works in a slightly different way. Once the Submit by E-mail button is pressed, the data entered into the PDF form is converted into an Extensible Markup Language (XML) file and attached to a blank e-mail message. You may edit the e-mail message at this time; the message is not sent until you open your e-mail program. The receiver of the message containing the XML file will need to have a database set up that is capable of reading and parsing XML files in order to access the information from the form.
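To give a sense of what the receiver has to do with the attached file, here is a minimal Python sketch that reads form fields out of an XML document using the standard library. The XML layout, the root element `form1`, and the field names (`FullName`, `Email`, `Comments`) are made up for illustration; the structure that LCD actually produces depends on the form design, so inspect a real submission before relying on anything like this.

```python
import xml.etree.ElementTree as ET

# Hypothetical example of submitted form data (not actual LCD output).
SAMPLE_XML = """<form1>
  <FullName>Jane Doe</FullName>
  <Email>jane@example.org</Email>
  <Comments>Please send the large-print version.</Comments>
</form1>
"""


def read_form_fields(xml_text: str) -> dict[str, str]:
    """Return a mapping of field name to value for a flat XML form."""
    root = ET.fromstring(xml_text)
    # Each direct child of the root is treated as one form field.
    # Empty fields have no text, so fall back to an empty string.
    return {child.tag: (child.text or "").strip() for child in root}


fields = read_form_fields(SAMPLE_XML)
for name, value in fields.items():
    print(f"{name}: {value}")
```

From here the values could be written to a spreadsheet or database. The sketch assumes a flat form with one element per field; nested sections or repeating rows would need extra handling.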
Building a new form using LCD from "scratch" is a bit more complicated. Once again, the application prompts you for the Form Return information and gives you the opportunity to add a Print button or E-mail button. Once this prompt has been completed, the user is presented with a blank form on the left side of the screen and, in a right panel, a "Library" of commonly used objects that may be dragged and dropped into the new form. After an object is dragged onto the blank form, the user must then use the Object panel to program the behavior of that object. Those familiar with object-based application development will find this fairly easy to figure out. Novice users will face a learning curve as they work out how to program each of the objects from the various help screens available through the application.
I had hoped that the template forms used by LCD would produce PDF forms that meet all accessibility requirements. I was disappointed to discover this was not the case. In the test files created in preparation for this article, the newly created PDF forms failed the Accessibility Checker test (in Adobe Acrobat Professional) for two reasons: lack of ALT text for the images, and lack of document language information. More frustrating was the fact that adding the ALT text to an image required re-opening the form in LCD. The step-by-step directions for adding the ALT text to an image in LCD are provided below.
Most frustrating was the determination that there was no way to change the document language specification. Attempts at modifying the Default Form Locale to English (USA) in the Form Properties screen did not change the outcome, and attempts to find a solution to this dilemma using the built-in help screens or the Adobe Customer Support website were fruitless. While document language is not a critical accessibility issue, it may matter in some situations. So caution is advisable when using PDF Forms.
The step-by-step directions for adding ALT text to images in LCD are as follows:
As noted in this and in other articles in this series, the accessibility of PDF documents has improved in recent years. However, PDFs should still be used cautiously when communicating with and engaging the general public. In keeping with standards in the field of accessibility, when dealing with the public, it is usually wise to provide digital documents in several formats to ensure universal access.
I have provided examples of PDF forms and some screen captures showing some techniques on the accessible documents support page on the Maine CITE website.
Maine CITE provides additional resources that can help you with your goal of creating accessible documents. http://www.mainecite.org/awd/accdocs.html
John Brandt is a web designer and consultant who works with the Maine CITE Program in the area of accessibility and universal design. He may be reached at firstname.lastname@example.org