How do I use scanned documents in Perusall?

Most PDF documents have an embedded "text layer" that tells Perusall where the text is on the page and what its contents are. This text layer enables students to highlight text directly on the page, and it also enables Perusall's built-in read-aloud feature to work. Having a text layer is essential for accessibility, as it also enables screen readers to access the text within the document.

If you upload a PDF to your Library that has no embedded text layer, you will see an accessibility warning when the document's properties are visible in the Library. When you see this warning, we recommend removing the PDF from your Library, running it through an OCR (Optical Character Recognition) application to add a text layer, and then re-uploading it to Perusall. Alternatively, you can enable Perusall's built-in automatic OCR for all uploaded documents under Settings > Advanced (note that this makes PDF uploads slower, because the OCR process runs on each page).
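
If you would like to check a PDF for a text layer before uploading it, the following is a minimal sketch of one way to do so in Python. It is not part of Perusall. The function names `pages_missing_text` and `extract_page_texts` are hypothetical, and the second function assumes the third-party `pypdf` package is installed. A page that yields no extractable text is probably a scanned image, although this is a heuristic and not the check Perusall itself performs.

```python
from typing import Iterable, List


def pages_missing_text(page_texts: Iterable[str], min_chars: int = 1) -> List[int]:
    """Return the 1-based numbers of pages with fewer than min_chars
    non-whitespace-trimmed characters of extracted text."""
    missing = []
    for number, text in enumerate(page_texts, start=1):
        # Treat None, empty strings, and whitespace-only text as "no text".
        if len((text or "").strip()) < min_chars:
            missing.append(number)
    return missing


def extract_page_texts(pdf_path: str) -> List[str]:
    """Extract text page by page. Assumes the third-party pypdf package."""
    from pypdf import PdfReader  # imported here so the helper above has no dependencies

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 2:
        flagged = pages_missing_text(extract_page_texts(sys.argv[1]))
        if flagged:
            print("Pages with no text layer (consider running OCR):", flagged)
        else:
            print("Every page has extractable text.")
```

Run the script with the path to a PDF as its only argument. If it flags any pages, run the document through an OCR application before uploading.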

To ensure the cleanest possible scan, Rebecca Oling, Director of Digital Accessibility at SUNY Purchase, shares the following tips:

  • Scan from a clean print copy (avoid skewing, markings, highlighting, shaded text, black gutters, etc.).
  • If there is bleed-through from thin pages (you see dots from text on the verso page), try placing a piece of black paper behind the page being scanned and scan in color.
  • Scan only one page at a time if possible; if you must scan two pages at once, press hard on the spine.
  • Be sure the reading order of the document is clear (that paragraphs aren’t arranged in an unusual way).
  • Take a moment to crop out shadows or other artifacts so that the page is as clean as possible.
  • Set the scanner to scan at higher quality (300 dpi or above).
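
To confirm that an existing scan meets the 300 dpi recommendation above, you can divide its pixel dimensions by the physical page size. The sketch below illustrates that arithmetic; the function name `effective_dpi` is hypothetical, and you would need to supply the pixel and page dimensions yourself.

```python
def effective_dpi(pixels_wide: int, pixels_high: int,
                  inches_wide: float, inches_high: float) -> float:
    """Return the lower of the horizontal and vertical dots per inch."""
    return min(pixels_wide / inches_wide, pixels_high / inches_high)


# A US Letter page (8.5 x 11 inches) scanned at 2550 x 3300 pixels.
print(effective_dpi(2550, 3300, 8.5, 11))  # 300.0
# The same page at 1275 x 1650 pixels falls short of the recommendation.
print(effective_dpi(1275, 1650, 8.5, 11))  # 150.0
```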
