Creation of the Images
The process for creating images in the Ad*Access Project and the Historic American Sheet Music Project was practically identical. The advertisements in the Ad*Access Project were scanned on UMAX Mirage II and Mirage IIse (11x17") and Agfa Arcus II (8x14") flatbed scanners. These were connected to Power Macintosh 7300/200 workstations running Mac OS 8 and Adobe Photoshop 4.0. From September 1997 to August 1998, Duke students working in the Digital Scriptorium scanned over 7,000 images. These master images were created at 150 dpi in 24-bit RGB color and saved in JPEG format. Testing indicated that the 150 dpi color scans provided sufficient resolution for 1mm characters to be adequately visible both on existing computer monitors and in laser-quality prints.
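As a rough check on the resolution figure above, a scan resolution can be converted into the number of pixels that cover a feature of a given size. The sketch below is illustrative only: the function names are hypothetical and were not part of the project's tooling.

```python
# Illustrative arithmetic only; these helper names are hypothetical.
MM_PER_INCH = 25.4


def pixels_per_mm(dpi: float) -> float:
    """Number of scanned pixels covering one millimetre of the original."""
    return dpi / MM_PER_INCH


def pixels_for_feature(dpi: float, feature_mm: float) -> float:
    """Number of pixels spanning a feature of the given size in millimetres."""
    return pixels_per_mm(dpi) * feature_mm


if __name__ == "__main__":
    for dpi in (72, 150):
        span = pixels_for_feature(dpi, 1.0)
        print(f"{dpi} dpi: {span:.1f} pixels across a 1 mm character")
```

At 150 dpi a 1 mm character spans roughly 5.9 pixels, compared with about 2.8 pixels at 72 dpi.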
Each master image was passed through a quality control process that checked image quality, amount of skew, page orientation, cropping, color, and any other problems that arose. The prevalence of moiré patterns caused by halftone dots was the primary scanning problem. Newspaper advertisements were the least difficult to scan; magazine advertisements presented additional difficulties. Variations in paper texture, illustration type (color drawing, photograph, etc.), the use of more than one illustration type in an advertisement, and overlapping of illustration and text each affected the presence of moiré patterns differently. A variety of techniques were devised to deal with this issue. The descreening feature in the image capture software corrected many problems on its own, but for the more difficult advertisements it was necessary to adjust the levels and/or to blur or sharpen portions of the advertisement or the entire advertisement.
Programming to automatically create 72 dpi images and thumbnails from the original 150 dpi scans was conceived and developed using the Perl scripting language and ImageMagick, a freely available UNIX graphics package. The conversion consisted of several steps. First, all 7,307 of the 150 dpi images were transferred to the Scriptorium machine (a dual-processor Sun Sparc 20) by FTP and then arranged according to the directory structure scheme devised at the beginning of the project (see below). This scheme allows for quick server access and ease of file management by creating a tree-like structure in which each branch may contain no more than 100 subdirectories.
During the scanning and database entry phase, each advertisement was identified by a unique identifier based upon the number pencilled on the advertisement during the selection phase of the project. These numbers were based on advertisement category (e.g. all Television advertisements begin with the letter T) and alphabetical/chronological placement in the selection of advertisements for the project. All the image files were renamed from their working names to a regular and easily identifiable file naming system based on the advertisement's identifier (T0017), followed by the size of the image - expressed as "150dpi" in this step (T0017-150dpi.jpeg). The few advertisements with multiple pages required more than one image to capture the total advertisement. These advertisements were identified with an image (page) number in addition to the advertisement's unique identifier (T0017-3-150dpi.jpeg). The unique identifier serves as the connecting link holding the database records and the images together.
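The naming convention described above can be sketched as a pair of helper functions. This is a minimal illustration, not the project's actual Perl code: the function names and the `ImageName` type are hypothetical, and the assumption that an identifier is one or more capital letters followed by digits is inferred only from the example T0017.

```python
import re
from typing import NamedTuple, Optional


class ImageName(NamedTuple):
    identifier: str       # advertisement identifier, e.g. "T0017"
    page: Optional[int]   # page number for multi-page advertisements, else None
    size: str             # size label, e.g. "150dpi"


# Assumed pattern: letters + digits, optional "-<page>", then "-<size>.jpeg".
_NAME_PATTERN = re.compile(
    r"^(?P<identifier>[A-Z]+\d+)(?:-(?P<page>\d+))?-(?P<size>[A-Za-z0-9]+)\.jpeg$"
)


def build_name(identifier: str, size: str, page: Optional[int] = None) -> str:
    """Build a file name such as T0017-150dpi.jpeg or T0017-3-150dpi.jpeg."""
    page_part = f"-{page}" if page is not None else ""
    return f"{identifier}{page_part}-{size}.jpeg"


def parse_name(filename: str) -> ImageName:
    """Split a file name back into identifier, page number, and size label."""
    match = _NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized file name: {filename!r}")
    page = match.group("page")
    return ImageName(
        identifier=match.group("identifier"),
        page=int(page) if page is not None else None,
        size=match.group("size"),
    )


if __name__ == "__main__":
    print(build_name("T0017", "150dpi"))
    print(parse_name("T0017-3-150dpi.jpeg"))
```

Because the identifier is always the first component of the name, any image file can be traced back to its database record by parsing the name alone.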
Finally, by taking advantage of the ability of the Scriptorium machine to run multiple processes and by employing another machine running Linux, multiple Perl conversion scripts were run day and night to generate the additional images. These included 7,307 images at 72 dpi, "small" images, and thumbnails. It was decided that both a "small" image measuring 300 pixels in width and a thumbnail measuring 100 pixels in width would be produced for each advertisement (and for every page of a multi-page advertisement). The small image is embedded within the database record, and any advertisement with multiple pages has all of its small images on one HTML page containing the images and the database information.
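The pixel dimensions of the derivative images follow directly from the dimensions of the 150 dpi master. The sketch below only computes the target sizes; it does not perform any image conversion and is not the project's Perl and ImageMagick code. The function names, the dictionary keys, and the choice to preserve the aspect ratio are assumptions made for illustration.

```python
from typing import Dict, Tuple

Size = Tuple[int, int]  # (width, height) in pixels


def scale_to_width(width: int, height: int, target_width: int) -> Size:
    """Scale to a fixed width, keeping the aspect ratio (an assumption)."""
    if width <= 0 or height <= 0 or target_width <= 0:
        raise ValueError("dimensions must be positive")
    return target_width, max(1, round(height * target_width / width))


def resample_for_dpi(width: int, height: int,
                     source_dpi: int = 150, target_dpi: int = 72) -> Size:
    """Pixel size of the same physical page at a different resolution."""
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    return (max(1, round(width * target_dpi / source_dpi)),
            max(1, round(height * target_dpi / source_dpi)))


def derivative_sizes(width: int, height: int) -> Dict[str, Size]:
    """Target sizes for the 72 dpi, small (300 px), and thumbnail (100 px) images."""
    return {
        "72dpi": resample_for_dpi(width, height),
        "small": scale_to_width(width, height, 300),
        "thumbnail": scale_to_width(width, height, 100),
    }


if __name__ == "__main__":
    # Hypothetical master: an 8.5 x 11 inch page scanned at 150 dpi.
    print(derivative_sizes(1275, 1650))
```

For the hypothetical 1275 x 1650 pixel master, this gives 612 x 792 pixels at 72 dpi, 300 x 388 pixels for the small image, and 100 x 129 pixels for the thumbnail.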
The Ad*Access database comprises approximately 21,980 individual images and uses 6 gigabytes of disk space. Individual 150 dpi images have an average file size of 795 kilobytes.
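The figures above can be cross-checked with simple arithmetic. The sketch below assumes that the 7,307 master images mentioned earlier are the 150 dpi images being averaged, and that decimal units apply (1 gigabyte = 1,000,000 kilobytes); the source does not state which unit convention was used.

```python
# Figures taken from the text; the unit convention is an assumption.
MASTER_IMAGE_COUNT = 7_307   # number of 150 dpi master images
AVERAGE_MASTER_KB = 795      # average size of a 150 dpi image, in kilobytes
KB_PER_GB = 1_000_000        # decimal gigabytes (assumed)


def estimated_storage_kb(image_count: int, average_kb: int) -> int:
    """Estimated total storage for a set of images, in kilobytes."""
    return image_count * average_kb


if __name__ == "__main__":
    total_kb = estimated_storage_kb(MASTER_IMAGE_COUNT, AVERAGE_MASTER_KB)
    print(f"{total_kb:,} KB, or about {total_kb / KB_PER_GB:.2f} GB")
```

If both figures are taken at face value, the 150 dpi masters alone would account for about 5.8 of the stated 6 gigabytes.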
The database that contained the information for the advertisements was converted from FoxPro 2.6a for Power Macintosh to SGML format in the EAD Version 1.0 DTD. Each database field was mapped to an EAD element made unique by use of attributes.
The resulting SGML database is presented in HTML for ordinary web browsers using DynaWeb software from INSO. This method allows for searches to be limited to the unique fields, which allows highly targeted searching. Access to the database is through both these targeted searches and "canned searches" on the category field, and selected subcategory fields for Radio and Television (See Browse Ad*Access). In addition a user may perform a targeted or keyword search within the DynaWeb interface.
For training students on the large UMAX scanners, the Historic American Sheet Music Scanning Procedures Page was employed with slight alteration in the area of descreening. The variety of paper textures and illustration types in the advertisements required the students to use a wider range of descreening options, as noted below. For the small Agfa Arcus scanners, the following information was given.
- Always clean scanner. Check for dust or brittle paper flakes on scanner bed - clean using compressed dusting gas.
- Check for smudges or fingerprints - clean by spraying glass cleaner onto cleaning pad and wiping glass. Spray glass cleaner ONLY onto the pad - spraying cleaner on the machine itself will allow the cleaner to get into the scanner and will create problems with the mechanics and electronics. Use ONLY the cleaning pads to wipe the glass, thus avoiding scratching or marking the glass.
- Determine if the page is a halftone image which will need descreening. For advertisements with color images from magazines, "Art Magazine" should be chosen. For advertisements from newspapers, "None" should be chosen. For magazine advertisements with black/white photos, scanning with a descreen of 120 or 150 is advised. For difficult cases, go up as far as 210. If after three or four scanning attempts the advertisement still does not look correct, contact Lynn.
- Handle the advertisements carefully, keeping in mind preservation concerns for the material. Place original on the scanner glass face down in the left lower corner, with the top of the image closest to you. If the advertisement does not fit on the small screen, let Lynn know. This will be shifted to be scanned on the large scanner.
- Every ad will have a black background. It is very important that the background is used at all times. This will create continuity and provide extra space needed in cropping.
- ALWAYS check the settings on the scanner when you come into work. They should appear as noted below.
- Open the Fotolook helper application in Photoshop under File/Import/Fotolook. *There is no way to cancel the scan process once it has begun. The scanning process takes from 3-5 minutes, depending on the item. Therefore, the following steps are extremely important for the successful and timely scanning of an item.
- Preview the item to ensure that the ad will be completely scanned. Use the mouse to frame the ad. Do not cut into any of the words or pictures of the ad. Leave a bit of the black background within the area to be scanned. The default size for the scanner bed area will be Max. Area. DO NOT readjust.
- Rotate turned images using the Rotate button at top/right of screen.
- The Fotolook screen should have these settings:
  Mode: Color RGB
  Original: Reflective
  Input: 150 ppi
  Scale To: 100%
  Range: Automatic
  Tone Curve: None
  Descreen: Art Magazine OR None (see * below)
  Sharpness: None
  ColorLink: None
  Optimize: Quality
  Preferences: General
  Settings: Last Used OR Current (see ** below)
* Descreen: For halftone images set to "Art Print 175lpi." If necessary set descreen to "Custom" and try 210 dpi. For line images set to "No Descreen." For advertisements with black and white photographs, set to "Custom" then choose 120 dpi or 150 dpi.
** Settings: "Last Used" or "Current" are the only settings that should appear in this portion of the screen. Do not choose "Default" - it changes all settings back to forms not applicable for this project. DO NOT change any other software settings.
Editing and Saving:
- Check for moiré patterns and unacceptable skew in the scanned image and re-scan if necessary.
- Sharpen or Blur the image using Filter/Sharpen or Filter/Blur. After the first use of the filter, its listing will appear at the head of the Filter menu. These filters can be faded once the image has first been sharpened or blurred.
- If additional cropping is necessary, select an area to be cropped using the square selection tool. Crop using Image/Crop.
- To save: Choose File/Save As.
- There will be a folder with each person's name on it. Choose this folder, open it, then name the file the same name as the file name, using this template:
- Double check the file name with the filename on the advertisement.
Selection of items:
The more than 7,000 advertisements included in the Ad*Access project were selected from the Pre-1955 Competitive Advertisements file of the J. Walter Thompson Company Archives. This file is one of the largest and most heavily used in the JWT Archives. It is also one of the most physically fragile parts of the Archives. Covering categories ranging from advertising agencies to transportation, the Competitive Ad file consists of an estimated 100,000 newspaper and magazine advertisement tear sheets for a wide array of products and services. The collection was maintained in the New York office of the agency as a reference file for JWT staff.
Ad*Access consists of advertisements created between the years 1911 - 1955 selected from five subject categories of broad popular appeal and demonstrated interest to researchers. The advertisement campaigns and examples included in this project provide a colorful snapshot of American business and society during the first half of this century. The forty-plus years of advertising surveyed in this project reflect the burgeoning advertising industry and its impact on the American consumer.
Staff from the John W. Hartman Center for Sales, Advertising & Marketing History pre-selected the five categories (Beauty and Hygiene, Radio, Television, Transportation, and World War II). Staff members jointly determined the selection criteria and trained several student workers to carry out the process of choosing ads for inclusion. Criteria for each of the categories were written and posted at the students' work stations. The Beauty and Hygiene category and the Transportation category each contain so many ads that stricter limits were placed on selection, to ensure completion of the project.
The Digital Scriptorium is equipped with four scanners, two small (8 x 14 inch) and two large (11 x 17 inch). Only ads small enough to fit the scanners were included in the on-line project. Many ads exceeded the dimensions of the largest scanners, and were categorized as "oversize." These advertisements were placed in acid free folders and boxes for preservation and onsite access but were not scanned. An experiment creating slides from the oversize ads and then scanning the slides showed that consistent quality and dpi levels for scanning could not be maintained. Therefore, no "oversize" advertisement images (e.g. full-size newspaper pages) have been included in the project.
Staff and students did side-by-side comparisons of ads to identify duplicates and near-duplicate examples. Near-duplicates (e.g. ads with the same image but different text, ads with the same text but different images, etc.) were included primarily in the Radio and Television categories. Exact duplicates of ads were removed before scanning (when recognized). Users should come upon very few exact duplicates in the Ad*Access database, as an effort has been made to eliminate them.
Many newspaper advertisements appear yellowed, or tanned, as compared to current newspapers; rips, tears, and incomplete ads (e.g. pieces torn out) are apparent in a small number of both magazine and newspaper advertisements. Ads were excluded if they were badly damaged or were too dark to scan.
The Transportation and Beauty & Hygiene categories, due to the extremely large number of items in the original collection, had additional limiting guidelines applied:
- colorful ads were chosen over similar black/white advertisements as much as possible for their greater visual appeal.
- "firsts" in the industry were included
- few damaged ads were included
- only selected advertisements from long-running campaigns were included; and some small campaigns judged to be of lower interest or significance were excluded.
- advertisements were included from well-recognized companies and also from companies which were represented by a large number of advertisements, even though not currently well known.
- in the Transportation category, brochures, correspondence and ads from travel agencies were excluded. Ads for military planes and cargo shipments also were excluded.
Total Numbers of Advertisements
The number of original ads available in each of the five main Ad*Access categories varied significantly. Therefore, the total numbers per category in Ad*Access vary considerably. Transportation and Beauty & Hygiene are the two largest categories; even with stricter selection guidelines they make up over two-thirds of the total content of the project (over 5,000 of 7,307 ads). The main categories break down as follows:
|Beauty and Hygiene||2,391|
|World War II||397|
The greatest numbers of ads are from the 1940s and the 1950s, with 3,750 and 2,146 ads respectively. There are only a few dozen ads from the 1910s, and all of them fall in the Beauty & Hygiene category. It is unfortunate that the numbers for that early decade are so low, but the nature of the categories included in the project is one factor: the radio, television, and World War II categories obviously would not be represented in the 1910s. The lack of transportation ads for the early years just reflects the nature of the items that survive in the Competitive Advertisements Collection from which this project is drawn.
A forty-two field database, created specifically for the Ad*Access project and based on access needs, was filled in for each advertisement included in the digital collection. The information was extracted from the advertisement itself. A standard thesaurus was not applied at the time of data entry as products, people, company names, etc. varied from advertisement to advertisement. Company names indexed were based on the "official" name of the company (Yardley of London, Inc. rather than Yardley) at the time of the advertisement, so company names may be seen to shift for the same product over the decades as a company expanded, merged, or was subsumed into another company. Every advertisement has a database entry, but not all fields were completed for every advertisement. If a field was not applicable, it was simply left blank.