Campus Publications Specifications (from 2015)

1. Collection Record

Create a collection record in the Library Digital Repository (LDR) for the Campus Publications digital collection. Do this only once. (SCRC)

Done. Campus Publications Digital Collection

2. Titles

Patterns are established by SCRC for each title.

3. Dates

For an annual, the pattern is YYYY. For a monthly, it is YYYY-MM. For a title published more often than monthly (for example, a weekly or a daily), the pattern is YYYY-MM-DD. Academic quarters should be recorded using the following as a pattern.

2016-01/2016-03 (for Winter 2016)
2016-04/2016-06 (for Spring 2016)
2016-07/2016-09 (for Summer 2016)
2016-10/2016-12 (for Autumn 2016)

Use the above as a basis for constructing other patterns, e.g.,

1960-07/1960-12 (Summer/Fall 1960)
1972-12/1973-03 (Winter 1972/1973)

For the first, the range runs from the beginning of summer to the end of autumn. For the second, the range begins with the month in which the quarter began that term (here, December 1972).

Dates recorded using the above as a pattern will be sorted correctly. Alternative display dates (if desired) can be generated from the underlying ISO 8601 specification.
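
The date patterns above can be sketched in Python as follows. This is only an illustration, not part of the specification: the names QUARTER_MONTHS, quarter_range, and is_valid_date are hypothetical, and the check covers only the four shapes described in this section (YYYY, YYYY-MM, YYYY-MM-DD, and a YYYY-MM/YYYY-MM range).

```python
import re

# Months spanned by each academic quarter, following the examples above.
QUARTER_MONTHS = {
    "winter": (1, 3),
    "spring": (4, 6),
    "summer": (7, 9),
    "autumn": (10, 12),
}

# YYYY, YYYY-MM, or YYYY-MM-DD (month 01-12, day 01-31; no calendar check).
SINGLE_DATE = re.compile(r"\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?")
# A start/end range of two YYYY-MM values separated by a slash.
DATE_RANGE = re.compile(r"\d{4}-(0[1-9]|1[0-2])/\d{4}-(0[1-9]|1[0-2])")


def quarter_range(quarter: str, year: int) -> str:
    """Return the YYYY-MM/YYYY-MM range for a standard academic quarter."""
    start, end = QUARTER_MONTHS[quarter.lower()]
    return f"{year:04d}-{start:02d}/{year:04d}-{end:02d}"


def is_valid_date(value: str) -> bool:
    """Check that a date string has one of the shapes described above."""
    return bool(SINGLE_DATE.fullmatch(value) or DATE_RANGE.fullmatch(value))
```

Because every pattern starts with the four-digit year followed by zero-padded components, an ordinary string sort of these values is also a chronological sort.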

Note: There are several codes to explain different ways in which a required element's value may go missing.

(:unac) temporarily inaccessible
(:unal) unallowed, suppressed intentionally
(:unap) not applicable, makes no sense
(:unas) value unassigned (e.g., Untitled)
(:unav) value unavailable, possibly unknown
(:unkn) known to be unknown (e.g., Anonymous, Inconnue)
(:none) never had a value, never will
(:null) explicitly and meaningfully empty
(:tba) to be assigned or announced later
(:etal) too numerous to list (et alia)
(:at) the real value is at the given URL or identifier

4. Numbering

Assign a four-digit mvol number for each title, e.g., 0001. These must be unique.

Campus Publications uses the following numbering scheme:

  • mvol-number-volume-issue
  • mvol-number-year-MMDD

"mvol" is always "mvol". "number" is the four-digit number for the title, left-padded with 0s, e.g., 0001. "volume" and "year" are both four-digit numbers; a volume is left-padded with 0s as necessary. "issue" and "MMDD" are four-digit numbers. Where a volume has no separate issues, or where a year includes all twelve months, "issue" and "MMDD" are 0000. Examples of these patterns appear below.

The site will initially be populated with the following titles:

Title                            mvol
Cap and Gown                     mvol-0001
University of Chicago Magazine   mvol-0002
Daily Maroon                     mvol-0004
Quarterly Calendar               mvol-0005
Annual Register                  mvol-0006
University Record                mvol-0007; mvol-0445 (New Series); mvol-0446 (3rd Series)

The site will also be populated with titles from this list.

5. Descriptions

Descriptions are established by a "describer", that is, either a cataloger or an archivist depending on the material.

6. Title Spreadsheet

For each title, prepare a spreadsheet with four columns:

Title Date Identifier Description

For example:

Title                           Date     Identifier           Description
University of Chicago Magazine  1908-10  mvol-0002-0001-0001  The alumni magazine of the University of Chicago.

Each row represents a volume or an issue of a title. In some cases, volumes or issues may be missing, or numbering may vary. Missing volumes or issues are not recorded on the spreadsheet; variations in numbering, etc., are noted in the Notes field. This makes it possible to write a script that checks whether all known volumes and issues are present in the directory (folder), and to generate a .dc.xml file automatically (see below).

Here is an example of the kinds of problems a spreadsheet may be used to indicate: "For Cap & Gown, the spreadsheet represents our holdings. We are missing two volumes: 1959/v52 and 1993/v74. Any other gaps in the date column are for years when there was no yearbook. There should not be any gaps in the volume numbers other than 52 and 74."

Here is a sample spreadsheet for the University of Chicago Magazine.

Titles, Dates and Numbering (Identifier) follow the specifications given above.
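
The check described above (are all volumes and issues listed on the spreadsheet present in the directory?) might look like the following Python sketch. It rests on assumptions not stated in this specification: the spreadsheet has been exported as a CSV file with a header row containing an "Identifier" column, and the function name missing_issues is hypothetical.

```python
import csv
from pathlib import Path
from typing import List


def missing_issues(csv_path: Path, root: Path) -> List[str]:
    """Return identifiers listed in the spreadsheet that have no directory under root."""
    missing = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            identifier = row["Identifier"].strip()
            # "mvol-0002-0001-0001" maps to root/mvol/0002/0001/0001
            issue_dir = root.joinpath(*identifier.split("-"))
            if not issue_dir.is_dir():
                missing.append(identifier)
    return missing
```

An empty list means every row of the spreadsheet has a matching issue directory; the sketch does not look for directories that are absent from the spreadsheet.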

7. Directory (folder) and file structure

The following patterns are possible:

mvol-number-volume-issue
mvol-number-year-MMDD

These will be described in the sections below.

7.1. mvol-number-volume-issue

Arrange issues according to the following hierarchy:

mvol/ -- all mvol titles
mvol/0001 -- individual mvol title
mvol/0001/0015 -- individual volume of that title
mvol/0001/0015/0008 -- individual issue of that volume

Note: if a volume has no issues, use the same hierarchy as above, but use 0000 for the issue, e.g.,

mvol/0001/0062/0000

Associated files follow this pattern:

mvol/0001/0036/0000/mvol-0001-0036-0000.pdf

NOTE: There are some anomalous patterns in Campus Publications.

Some early issues include a letter as part of the issue number, e.g., mvol/0007/0001/034A, mvol/0007/0008/011C, etc. In these cases, use an upper case letter, as indicated.

In at least one case, an issue has been split into two parts upon publication. In these cases, precede the issue number with an upper case "A" for part 1, "B" for part 2, etc., keeping the issue component of the pattern four characters long. For example:

mvol/0002/0033/B009 corresponds to part 2 of issue 9. (Issue 8 will be mvol/0002/0033/0008.)
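
As an illustration of the numbering scheme and the anomalous issue forms just described, here is a minimal Python sketch that builds and parses identifiers. The names MvolId, make_identifier, and parse_identifier are hypothetical, and the pattern accepts only the three issue shapes mentioned in this document (four digits, three digits plus a trailing upper case letter, or a leading upper case letter plus three digits).

```python
import re
from typing import NamedTuple

# Issue component: 0008, 034A (letter suffix), or B009 (part prefix).
ISSUE = r"\d{4}|\d{3}[A-Z]|[A-Z]\d{3}"
IDENTIFIER = re.compile(
    r"mvol-(?P<number>\d{4})-(?P<volume>\d{4})-(?P<issue>" + ISSUE + r")"
)


class MvolId(NamedTuple):
    number: str  # four-digit title number
    volume: str  # four-digit volume or year
    issue: str   # four-character issue or MMDD


def make_identifier(number: int, volume_or_year: int, issue_or_mmdd: int = 0) -> str:
    """Build an identifier from numeric parts, left-padding each with 0s."""
    return f"mvol-{number:04d}-{volume_or_year:04d}-{issue_or_mmdd:04d}"


def parse_identifier(identifier: str) -> MvolId:
    """Split an identifier into its parts, rejecting anything off-pattern."""
    match = IDENTIFIER.fullmatch(identifier)
    if match is None:
        raise ValueError(f"not a valid mvol identifier: {identifier!r}")
    return MvolId(match["number"], match["volume"], match["issue"])
```

For a volume without issues, make_identifier(1, 62) yields the 0000 issue component used in the hierarchy above.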

7.2. mvol-number-year-MMDD

Arrange issues according to the following hierarchy:

mvol
mvol/0004
mvol/0004/1910
mvol/0004/1910/0104

Associated files follow this pattern:

mvol/0004/1910/0104/mvol-0004-1910-0104.pdf
mvol/0004/1910/0105/mvol-0004-1910-0105.pdf
mvol/0004/1910/0106/mvol-0004-1910-0106.pdf
mvol/0004/1910/0107/mvol-0004-1910-0107.pdf

The preceding example represents the Daily Maroon for January 4th-7th, 1910.
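
For a dated title such as the Daily Maroon, the path to an issue's PDF can be derived from the publication date. The following Python sketch assumes that each issue has its own directory named after its MMDD component and holds a PDF named after the full identifier; the function name daily_issue_pdf is hypothetical.

```python
from datetime import date
from pathlib import PurePosixPath


def daily_issue_pdf(number: int, issued: date) -> PurePosixPath:
    """Return the relative path of the PDF for a dated issue of a title."""
    identifier = (
        f"mvol-{number:04d}-{issued.year:04d}-{issued.month:02d}{issued.day:02d}"
    )
    # The directory hierarchy mirrors the hyphen-separated identifier parts.
    directory = PurePosixPath(*identifier.split("-"))
    return directory / f"{identifier}.pdf"
```

For example, daily_issue_pdf(4, date(1910, 1, 4)) gives mvol/0004/1910/0104/mvol-0004-1910-0104.pdf.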

8. Files and Naming Conventions

The following is an example of the directory (folder) and file structure for an issue.

mvol/0007/0012/0001/
  ALTO/
  JPEG/
  TIFF/
  mvol-0007-0012-0001.dc.xml
  mvol-0007-0012-0001.mets.xml
  mvol-0007-0012-0001.pdf
  mvol-0007-0012-0001.struct.txt
  mvol-0007-0012-0001.txt

8.1. ALTO

The ALTO directory contains XML files containing OCR output with position data for each page of an issue.

mvol-0007-0012-0001_0001.xml
mvol-0007-0012-0001_0002.xml
mvol-0007-0012-0001_0003.xml
mvol-0007-0012-0001_0004.xml
mvol-0007-0012-0001_0005.xml
mvol-0007-0012-0001_0006.xml
mvol-0007-0012-0001_0007.xml
mvol-0007-0012-0001_0008.xml
mvol-0007-0012-0001_0009.xml
mvol-0007-0012-0001_0010.xml
[etc.]

8.2. JPEG

The JPEG directory contains a JPEG derivative image for each page of an issue.

mvol-0007-0012-0001_0001.jpg
mvol-0007-0012-0001_0002.jpg
mvol-0007-0012-0001_0003.jpg
mvol-0007-0012-0001_0004.jpg
mvol-0007-0012-0001_0005.jpg
mvol-0007-0012-0001_0006.jpg
mvol-0007-0012-0001_0007.jpg
mvol-0007-0012-0001_0008.jpg
mvol-0007-0012-0001_0009.jpg
mvol-0007-0012-0001_0010.jpg
[etc.]

The numbering of objects is sequential, beginning with 0000_0001. The string consists of 8 numerals separated into two groups of four by an underscore character, left-padded with 0s as necessary to fill out the length of the string.

8.3. TIFF

The TIFF directory contains a digital masterfile for each page of an issue.

mvol-0007-0012-0001_0001.tif
mvol-0007-0012-0001_0002.tif
mvol-0007-0012-0001_0003.tif
mvol-0007-0012-0001_0004.tif
mvol-0007-0012-0001_0005.tif
mvol-0007-0012-0001_0006.tif
mvol-0007-0012-0001_0007.tif
mvol-0007-0012-0001_0008.tif
mvol-0007-0012-0001_0009.tif
mvol-0007-0012-0001_0010.tif
[etc.]
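
The per-page file names in the ALTO, JPEG, and TIFF directories follow one pattern: the issue identifier, an underscore, a four-digit page sequence number, and an extension. A small Python sketch of that pattern follows, based only on the listings above; the names PAGE_DIRECTORIES, page_filename, and expected_page_files are hypothetical.

```python
from typing import Dict, List

# Directory name -> file extension, as shown in the listings above.
PAGE_DIRECTORIES = {"ALTO": "xml", "JPEG": "jpg", "TIFF": "tif"}


def page_filename(identifier: str, page: int, extension: str) -> str:
    """Return the file name for one page, e.g. mvol-0007-0012-0001_0001.tif."""
    return f"{identifier}_{page:04d}.{extension}"


def expected_page_files(identifier: str, page_count: int) -> Dict[str, List[str]]:
    """List the file names expected in each page-level directory of an issue."""
    return {
        directory: [
            page_filename(identifier, page, extension)
            for page in range(1, page_count + 1)
        ]
        for directory, extension in PAGE_DIRECTORIES.items()
    }
```

Comparing this expected listing with the actual directory contents is one way to spot a page that is missing from one of the three directories.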

8.4. .dc.xml

The .dc.xml file consists of descriptive metadata for an issue: title, date, identifier, description. Here is an example:

<?xml version="1.0"?>
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>University Record</dc:title>
<dc:date>1907-07</dc:date>
<dc:identifier>mvol-0007-0012-0001</dc:identifier>
<dc:description>Official reports, addresses, actions of Ruling Bodies,
notices of campus events, and activities of faculty.</dc:description>
</metadata>

The basename of the .dc.xml file, in this example, mvol-0007-0012-0001, MUST conform to the identifier field in the .dc.xml record. The title and description are defined by the Special Collections Research Center (SCRC).

Dates MUST conform to the ISO 8601 specification. See above under Dates.

.dc.xml files MAY be generated automatically from information in the Title Spreadsheet; see above.
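
A .dc.xml record with the shape shown above could be generated from one spreadsheet row roughly as follows. This is a sketch using Python's standard xml.etree.ElementTree module, not the tool actually used for the collection; the function name build_dc_xml is hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
# Serialize elements in this namespace with the "dc" prefix.
ET.register_namespace("dc", DC_NS)


def build_dc_xml(title: str, date: str, identifier: str, description: str) -> str:
    """Return a .dc.xml document for one issue as a string."""
    root = ET.Element("metadata")
    root.text = "\n"
    fields = (
        ("title", title),
        ("date", date),
        ("identifier", identifier),
        ("description", description),
    )
    for tag, text in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{tag}")
        child.text = text
        child.tail = "\n"  # one element per line, as in the example above
    return '<?xml version="1.0"?>\n' + ET.tostring(root, encoding="unicode") + "\n"
```

The returned string would be written to a file whose basename is the identifier, for example mvol-0007-0012-0001.dc.xml, so that the file name and the dc:identifier value agree.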

8.5. .mets.xml

The .mets.xml file is a METS file produced by the LIMB software. It includes technical metadata for the digital masterfiles (TIFF images) comprising the pages of an issue.

8.6. .pdf

The PDF file, in this example, mvol-0007-0012-0001.pdf, contains page images plus OCR for an issue.

8.7. .struct.txt

The .struct.txt file consists of structural metadata for an issue: object, page, milestone. Here is an example:

object          page    milestone
00000001                cover
00000002
00000003
00000004
00000005        7
00000006        8
00000007        9
00000008        10
00000009
00000010
00000011        13
00000012        14
00000013        15
00000014        16
00000015        17
00000016        18
00000017        19
00000018        20
00000019        21
00000020        22

Note: The preceding is an abbreviated form of the following.

collection document  object      page component milestone 
mvol       0005      00000001         1         cover
mvol       0005      00000002         1          
mvol       0005      00000003         1          
mvol       0005      00000004         1         
mvol       0005      00000005    7    1          
mvol       0005      00000006    8    1          
mvol       0005      00000007    9    1          
[etc.]
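
The abbreviated three-column form of .struct.txt can be read with a few lines of Python. The sketch below assumes the file is tab-delimited with the header row object, page, milestone; this specification does not state the delimiter, so that is an assumption. The function name parse_struct is hypothetical.

```python
import csv
import io
from typing import Dict, List


def parse_struct(text: str) -> List[Dict[str, str]]:
    """Parse abbreviated .struct.txt content into one dict per object."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = []
    for row in reader:
        rows.append(
            {
                # Columns missing from a short row are treated as empty.
                "object": (row.get("object") or "").strip(),
                "page": (row.get("page") or "").strip(),
                "milestone": (row.get("milestone") or "").strip(),
            }
        )
    return rows
```

Each returned dict holds the eight-digit object number together with the printed page number and milestone, either of which may be empty.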

8.8. .txt

The mvol-0007-0012-0001.txt file contains raw OCR for an issue with no position data.

9. Interface

Date: 2017-11-30; 2024-02-28

Author: Charles Blair

Created: 2024-02-28 Wed 11:47