Technical Information

User Information

Pagination

The Sheet Music images in the Historic American Sheet Music Project were arranged according to the following layout:

  • The images were numbered according to the order in which they were scanned, not according to the page numbers appearing on the pages of a piece.
  • When multiple copies of the same piece were available which were perfectly identical, the best copy was scanned.
  • When multiple copies of the same piece were available in which only the illustrated title page was different, the two illustrated title pages were scanned as image 1 and 2, followed by the best copy of the rest of the pages of the piece.
  • When multiple copies of the same piece were available in which the advertisements appearing in the piece were different, the advertisements are presented sequentially within the numbering scheme as in the following example:
    • Image 1 - Title Page
    • Image 2 - Advertisement from piece 1
    • Image 3 - Advertisement from piece 2
    • Image 4-6 - Best copy of music
    • Image 7 - Advertisement from piece 1
    • Image 8 - Advertisement from piece 2
    • Image 9 - Back page of piece 1
    • Image 10 - Back page of piece 2
  • In some instances the copies were different enough that both merited scanning, but not different enough to be designated as different editions. In these cases, the first piece was scanned, followed by the second. For instance, piece one may occupy images 1-5 and piece two images 6-10. This occurs rarely.

Printing

Users are welcome to print images from the Historic American Sheet Music Project for personal or research use (see Copyright). Print quality depends largely on your system, your software, and your printer. Acceptable results have been obtained by printing directly from Mozilla or Internet Explorer, but saving the image and printing it from a graphics program is also an option.

Images and texts on these pages are intended for research or educational use only. Please read our Statement on use and reproduction for further information on how to receive permission to reproduce an item or how to cite it.

Sensitive Materials

This site includes historical materials that may contain offensive language or negative stereotypes reflecting the culture or language of a particular period or place. These items are presented as part of the historical record and do not reflect the values and beliefs of Duke University, the David M. Rubenstein Rare Book & Manuscript Library, or the Library of Congress/Ameritech, who provided funding for the project.

Selection, Indexing and Access

Selection of items

The items in this digital collection were selected from the cataloged and uncataloged material in the David M. Rubenstein Rare Book & Manuscript Library and represent a wide variety of musical styles. Because of the strength of Duke's holdings in Southern Americana, Confederate imprints in particular, all of the sheet music published in the South during the Civil War was selected. Only those significantly imperfect items (e.g., pages missing) were omitted. Even then, if the Duke item is unique, it was scanned.

A wide variety of types of music were selected, including bel canto, minstrel songs, protest songs, sentimental songs, patriotic and political songs, plantation songs, Civil War songs, spirituals, dance music, songs from vaudeville and musicals, "Tin Pan Alley" songs, and songs from World War I. Also included is piano music: marches, variations, opera excerpts, and dance music.

Every effort was made to select music that was representative of the time and genre. Among the many pieces for piano, for instance, there are simple dances for beginning piano students, duets of somewhat greater difficulty, and virtuoso works for the accomplished musician. There are a few songs with guitar, and a few with flute or violin accompaniment. There are even a few pieces for piano, 6 hands (imagine three children on a piano bench!). During the process of selecting items for digitization, we cooperated with Brown University, which also received a Library of Congress/Ameritech award for an African-American Sheet Music Digitizing Project. Although we have selected some similar editions, every attempt was made to avoid duplication. Thus the Duke collection includes only a small sample of dialect or plantation songs, which were the focus of the Brown project.

In addition to balancing the collection between vocal and instrumental, "classical" and "popular," entertainment for the home and public entertainment (e.g., musicals), we tried to maintain a balance of titles in each decade. Because of the strength of our Southern history collections, however, the Civil War period includes more titles than any other. This is the breakdown by decade:

  • 1850-1859: 288
  • 1860-1869: 1048
  • 1870-1879: 183
  • 1880-1889: 157
  • 1890-1899: 228
  • 1900-1909: 478
  • 1910-1920: 660
  • Total: 3042

Indexing policies

Each title was indexed from the original item using a template. Some subject headings were assigned from the Library of Congress subject headings, but more emphasis was placed on assigning headings developed for this digital collection. Some names of composers, authors, and other contributors were compared with the Library of Congress authority file. Few of the names actually appear in the authority file, and sometimes a name was established only after many items had already been completed. In those cases, no attempt was made to go back systematically through every heading after the indexing was completed.

Specialized subject access

Phrases designed for this project were used to bring similar material together. These terms are described in the Subject Terms section of the glossary. Each piece was assigned one or more phrases appropriate to the item.

Creation of the Images

The sheet music in the Historic American Sheet Music Project was scanned on UMAX Mirage II and Mirage IIse 11x17" flatbed scanners. These were connected to Power Macintosh 7300/200 workstations running Mac OS 8 and Adobe Photoshop 4.0. Over the course of two semesters, Duke students working in the Digital Scriptorium scanned over 16,500 images. These master images were created at 150 dpi in 24-bit RGB color and saved in JPEG format. Testing indicated that the 150 dpi color scans provided sufficient resolution for 1 mm characters to be adequately visible on both existing computer monitors and laser-quality prints.

Each master image was put through a quality control process and checked for image quality, pagination, page orientation, amount of skew, cropping, color, and any other problems that arose. The biggest scanning problem encountered was the prevalence of Moiré patterns caused by halftone dots in the page being scanned. Most often these appear on the illustrated title pages, but they are frequently found elsewhere in the pieces as well. A variety of techniques were devised to deal with this issue. The descreening feature of the image capture software frequently corrected the problem, while in more difficult cases a slight application of Photoshop's blur filter was employed.

Programming to automatically create 72 dpi images and thumbnails from the original 150 dpi scans was conceived and developed using the Perl scripting language and ImageMagick, a freely available UNIX graphics package. The conversion consisted of several steps. First, all 16,596 of the 150 dpi images were transferred to the Scriptorium machine (a dual-processor Sun Sparc 20) by FTP and then arranged according to the directory structure scheme devised at the beginning of the project. This scheme allows for quick server access and ease of file management by creating a tree-like structure in which each branch may contain no more than 100 subdirectories.
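
The project's actual directory scheme is not spelled out here beyond the limit of 100 subdirectories per branch. As a minimal sketch of one way to respect that limit, the Python below (not the project's Perl) splits a hypothetical sequential piece index into two-digit path components. The function name `bucket_path`, the use of a sequential index, and the two-level depth are all assumptions made for illustration.

```python
from pathlib import PurePosixPath


def bucket_path(piece_index: int, levels: int = 2) -> PurePosixPath:
    """Map a non-negative index to nested directories named 00-99.

    Each level holds at most 100 subdirectories, so `levels` levels can
    address 100 ** levels leaf directories. (Hypothetical scheme.)
    """
    if not 0 <= piece_index < 100 ** levels:
        raise ValueError(f"index {piece_index} does not fit in {levels} levels")
    parts = []
    remaining = piece_index
    for _ in range(levels):
        # Peel off the lowest two decimal digits at each step.
        remaining, component = divmod(remaining, 100)
        parts.append(f"{component:02d}")
    # Most significant component becomes the outermost directory.
    return PurePosixPath(*reversed(parts))


if __name__ == "__main__":
    print(bucket_path(7))
    print(bucket_path(3041))
```

With two levels this layout addresses 10,000 leaf directories, which would be enough for the 3,042 pieces in the collection.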

During the scanning phase, the 150 dpi images had been simply identified by a unique identifier based upon the call number followed by the image number. All the files were renamed from their working names to a regular and easily identifiable file naming system based on the call number of the piece of music, followed by the image number, followed by the size of the image - expressed as "150dpi" in this step. The unique identifier serves as the key which holds the database records and the images together.
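
The text gives the order of the name components (call number, image number, size) but not the separators, padding, or file extension. The following Python sketch assumes underscores, a three-digit image number, and a `.jpg` extension. The call number `a0001` and the helper names `build_name` and `parse_name` are likewise hypothetical.

```python
import re
from typing import NamedTuple


class ImageName(NamedTuple):
    call_number: str
    image_number: int
    size: str


# Assumed layout: <call number>_<3-digit image number>_<size>.jpg
_NAME_PATTERN = re.compile(
    r"^(?P<call>.+)_(?P<num>\d{3})_(?P<size>[A-Za-z0-9]+)\.jpg$"
)


def build_name(call_number: str, image_number: int, size: str = "150dpi") -> str:
    """Compose a file name from the three components named in the text."""
    return f"{call_number}_{image_number:03d}_{size}.jpg"


def parse_name(filename: str) -> ImageName:
    """Split a file name back into its components, or raise ValueError."""
    match = _NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized file name: {filename!r}")
    return ImageName(match["call"], int(match["num"]), match["size"])


if __name__ == "__main__":
    name = build_name("a0001", 3)
    print(name, parse_name(name))
```

Because the call number and image number can be recovered from every file name, the same pair can act as the key that ties an image to its database record.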

Finally, by taking advantage of the ability of the Scriptorium machine to run multiple processes and employing another machine running Linux, multiple Perl conversion scripts were run both by day and night allowing the generation of 22,680 additional images in the period of approximately a week. These included 16,596 72 dpi images, 3,042 "small" images, and 3,042 thumbnails. It was decided that both a "small" image measuring 300 pixels in width and a thumbnail measuring 100 pixels in width would be produced for each of the illustrated title pages of the sheet music. The small image is embedded within the database record, and the thumbnail serves to maintain context while browsing through a piece.
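
To make the three derivative sizes concrete, the Python sketch below computes target pixel dimensions only; it does not reproduce the project's Perl and ImageMagick scripts. The sample master size of 1650 x 2550 pixels assumes a scan filling the whole 11 x 17 inch bed at 150 dpi, which real pages would rarely do, and the function names are hypothetical.

```python
def scale_to_dpi(width_px: int, height_px: int,
                 source_dpi: int = 150, target_dpi: int = 72) -> tuple[int, int]:
    """Resize so the image keeps its physical size at a lower resolution."""
    return (round(width_px * target_dpi / source_dpi),
            round(height_px * target_dpi / source_dpi))


def scale_to_width(width_px: int, height_px: int,
                   target_width: int) -> tuple[int, int]:
    """Resize to a fixed pixel width, preserving the aspect ratio."""
    return target_width, round(height_px * target_width / width_px)


if __name__ == "__main__":
    master = (1650, 2550)  # assumed: full 11 x 17 inch bed at 150 dpi
    print("72 dpi:", scale_to_dpi(*master))
    print("small: ", scale_to_width(*master, 300))
    print("thumb: ", scale_to_width(*master, 100))
```

The 72 dpi version is defined by resolution, so its pixel size varies with the page, while the "small" and thumbnail versions are defined by a fixed width of 300 and 100 pixels.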

As a page-turning mechanism, wrappers for each piece were created in HTML by a Perl script. The wrappers supply a table of contents listing each page and a method of viewing both 72 and 150 dpi image sizes while maintaining the context and pagination of the piece.
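
A wrapper of this kind can be sketched in a few lines. The Python below is an illustration, not the project's Perl script: the markup, the function name `build_wrapper`, and the image file names (call number, three-digit image number, and size joined by underscores) are all assumptions.

```python
from html import escape


def build_wrapper(title: str, call_number: str, page_count: int) -> str:
    """Return an HTML page listing every image of a piece at both sizes."""
    safe_title = escape(title)
    safe_call = escape(call_number)
    items = []
    for number in range(1, page_count + 1):
        base = f"{safe_call}_{number:03d}"  # hypothetical naming pattern
        items.append(
            f"<li>Image {number}: "
            f'<a href="{base}_72dpi.jpg">72 dpi</a> | '
            f'<a href="{base}_150dpi.jpg">150 dpi</a></li>'
        )
    return (
        f"<html><head><title>{safe_title}</title></head>\n"
        f"<body>\n<h1>{safe_title}</h1>\n<ul>\n"
        + "\n".join(items)
        + "\n</ul>\n</body></html>\n"
    )


if __name__ == "__main__":
    print(build_wrapper("Example Waltz", "a0001", 3))
```

Listing images by scan order rather than printed page number matches the pagination rules described under User Information.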

The sheet music database comprises a total of 39,276 individual images and uses 20.48 gigabytes of disk space, including HTML wrappers. Individual 150 dpi images have an average file size of 0.99 megabytes, and the average number of pages/images per piece is 5.46. A 37.8 gigabyte RAID disk array was added to the Digital Scriptorium's Sun Solaris Internet server, and the machine has been upgraded with additional enhancements designed to speed access to its digital resources.
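
The image counts quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below uses Python purely for illustration. It assumes that the 3,042 pieces in the decade breakdown are the denominator for the per-piece average, which the text implies but does not state.

```python
MASTER_IMAGES = 16_596      # 150 dpi scans
SCREEN_IMAGES = 16_596      # one 72 dpi derivative per master
SMALL_IMAGES = 3_042        # one per illustrated title page
THUMBNAILS = 3_042          # one per illustrated title page
PIECES = 3_042              # total from the breakdown by decade

derivatives = SCREEN_IMAGES + SMALL_IMAGES + THUMBNAILS
total_images = MASTER_IMAGES + derivatives
average_images_per_piece = MASTER_IMAGES / PIECES

if __name__ == "__main__":
    print("derivatives:", derivatives)
    print("total images:", total_images)
    print("images per piece:", round(average_images_per_piece, 2))
```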

Database Format

The flat-file database which contained the indexing information for the pieces was converted to SGML format in the form of the Encoded Archival Description (EAD) Version 1.0 DTD using the Microsoft Word macro language and Perl. Each field in the database was mapped to an EAD element which was made unique by use of attributes.
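
The field-to-element mapping itself is not listed here. As a hedged illustration of the idea, the Python sketch below maps a few hypothetical database fields to elements distinguished by attributes. The field names, the choice of `unittitle`, `persname`, and `subject`, and the attribute values are assumptions that have not been checked against the EAD Version 1.0 DTD. The output is XML from the standard library, not the SGML the project produced with Word macros and Perl.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: database field -> (element name, attributes).
# Two fields may share an element; the attribute keeps them distinct.
FIELD_MAP = {
    "title": ("unittitle", {}),
    "composer": ("persname", {"role": "composer"}),
    "lyricist": ("persname", {"role": "lyricist"}),
    "subject_local": ("subject", {"source": "local"}),
}


def record_to_element(record: dict[str, str]) -> ET.Element:
    """Convert one flat-file record into a component element."""
    component = ET.Element("c01")
    for field, value in record.items():
        if field not in FIELD_MAP:
            raise KeyError(f"no mapping defined for field {field!r}")
        tag, attributes = FIELD_MAP[field]
        child = ET.SubElement(component, tag, attributes)
        child.text = value
    return component


if __name__ == "__main__":
    sample = {"title": "Example Waltz", "composer": "Doe, Jane"}
    print(ET.tostring(record_to_element(sample), encoding="unicode"))
```

Keeping every field addressable by its element name and attribute is what lets a search be limited to a single field.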

The resulting SGML database is presented in HTML for ordinary web browsers using DynaWeb software from INSO. This method allows for searches to be limited to the unique fields, which allows highly targeted searching. Access to the database is through both targeted searching and "canned searches" on the Subject Content, Illustration Type, and Advertising subject fields. In addition a user may perform a targeted or keyword search within the DynaWeb interface.

Credits

Members of the Digital Scriptorium Historic American Sheet Music Project team:

Stephen D. Miller - Historic American Sheet Music Project Manager
Overall project management including workflow and scanning, interface design and Web site creation, EAD Encoding, Perl scripting and image conversion, and quality control of images and data.
Lois Schultz - Music Cataloger and Subject Expert
Music selection, indexing, author of About Sheet Music section and Selection and Indexing information and other pages.
Lynn Pritcher - Project Manager for the Ad*Access project
Assisted with the Historic American Sheet Music Project
Steve Hensen - Historic American Sheet Music Project Director
 
Paolo Mangiafico - Director of the Digital Scriptorium
 

Special thanks to all the Duke University students who worked on the Historic American Sheet Music Project:

  • Kirsten Braaten
  • Margaux Butler
  • William Clarkson
  • Erin Graham
  • James Harkins
  • Anjali Harsh
  • Jamie Kelley
  • Cat Saleeby
  • Brad Siegele
  • James Sizemore
  • Heather Swagart
  • Michael White
  • Josh Wilson