In this Twitter thread, Kristina Spurgin describes a pattern of errors in MARC records that I’ve been running into lately as well: a lowercase L in place of a one in dates, like “l905”. I’d also been assuming it was an artifact of OCR.

Her coworker explained that typewriters used to lack a key for the digit one, and even after the key was added, typists had strong muscle memory for typing l (ell) instead of 1.
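A pattern like this is easy to flag in batch. Here is a minimal sketch, assuming the dates arrive as plain strings and that the error always looks like a lowercase ell followed by three digits. The function name `suggest_date_fix` is made up for illustration, and an ell in the middle of a date (like “19l5”) would need a different pattern.

```python
import re

# A lowercase ell followed by exactly three digits, standing alone as a word,
# e.g. "l905". This is an assumption about what the error looks like.
ELL_DATE = re.compile(r"\bl(\d{3})\b")


def suggest_date_fix(text: str) -> str:
    """Return text with ell-for-one dates rewritten, e.g. 'l905' becomes '1905'."""
    return ELL_DATE.sub(r"1\1", text)


print(suggest_date_fix("New York, l905."))  # New York, 1905.
```

I would treat the output as a suggestion to review rather than an automatic correction.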

Now, if I can only figure out why this title started with “0il”…


Film imprint – Lights, camera, 261!


It turns out there is a field specifically for imprint statements for films, with dedicated subfields for producing company and releasing company.  And it goes waaaay back, to AACR1, pre-1976.

Before 1976, AACR apparently instructed catalogers to create an imprint statement for films that is totally different from the one for books, and that statement went into MARC 261. In addition to subfields for producing and releasing company, it also has one for “contractual producer,” a category I’m not sure I understand (since I haven’t yet gotten my hands on a pre-1976 copy of AACR). The order of the subfields is different from 260, as well: for example, the place of production, release, etc. is 261 $f, as opposed to 260 $a. Additionally, the subfields for date and place are repeatable when multiple producing companies, releasing companies, and contractual producers are recorded (which is a total nightmare for thinking about translating that data).  Here’s the actual field definition:


Yay! Thank you for this blog!


The safe care and handling of gases. (OCLC #6526758)

Slides! I’ve never gotten to catalog slides before. It was a nice opportunity to review the rules for this format. The set does have an accompanying audiocassette:

300 ǂa 37 slides : ǂb color ; ǂc 2 x 2 in. + 
    ǂe 1 audiocassette.

So there are two sets of Content/Media/Carrier fields:

336 __ ǂa still image ǂb sti ǂ2 rdacontent
337 __ ǂa projected ǂb g ǂ2 rdamedia
338 __ ǂa slide ǂb gs ǂ2 rdacarrier
336 __ ǂa spoken word ǂb spw ǂ2 rdacontent ǂ3 accompanying material
337 __ ǂa audio ǂb s ǂ2 rdamedia ǂ3 accompanying material
338 __ ǂa audiocassette ǂb ss ǂ2 rdacarrier ǂ3 accompanying material 

However, the soundtrack is considered an integral part of the slide set, so its sound characteristics are included in the slides’ 007 (not as a separate 007):

007 __ ǂa g ǂb s ǂd c ǂf b ǂg f ǂh j

El Quijote universal : 150 traducciones en el IV Centenario de la muerte de Miguel de Cervantes / editado por José Manuel Lucía Megías. (OCLC #985966961)

The OCLC Bib Formats documentation says for field 041:

For works in multiple languages, the codes for the languages are recorded in the order of their predominance. If predominance cannot be determined, record the codes in English alphabetical order. If the code mul (Multiple languages) is recorded in Lang (meaning the item is multilingual with no predominant language), the code for the title (or the first title, if there are more than one) and the code mul are recorded. Alternatively, any number of specific language codes may be recorded in repeating occurrences of subfield ǂa.

The first option is used for this book:

    Lang: spa 
    041 1_ ǂa spa ǂa mul ǂh spa

So where to draw the line? How many languages are too many to list? This decision may vary by institution; for example, the National Library of Medicine used the mul code for titles in more than six languages. I don’t know that my library has a limit for how many languages we will list, but with translations in 150 languages, mul seems far more reasonable!


Un catecismo para los negocios : respuestas de la enseñanza católica a los dilemas éticos de la empresa / Andrew V. Abela, Joseph E. Capizzi ; traducción, Francisco J. Lara. (OCLC #956991214)

While MARC 020 $a is for the ISBN for the particular item you are cataloging:

020 __ ǂa 9780813228877 (electronic bk.)

you can include more ISBNs in 020 $z (“Canceled/invalid ISBN”). This subfield is often used to include the ISBN of a different version of the title, such as an eISBN in a record for the print.

020 __ ǂz 9780813228860 (pbk. : alk. paper)

Today I found the ISBN for the (original) English version of this title in the record for its translation into Spanish:

020 __ ǂz 9780813228846

This was included in a $z because, although it was structurally valid (correct number of digits, matching check digit), it was an invalid application: the ISBN belongs to a different resource.
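The “structurally valid” part can be checked mechanically. Below is a minimal sketch of the standard ISBN-13 check-digit test; the function name `isbn13_is_structurally_valid` is made up for illustration. It says nothing about whether the ISBN is the right one for the resource in hand, which is exactly the distinction that sends this one to $z.

```python
def isbn13_is_structurally_valid(isbn: str) -> bool:
    """Return True if the ISBN has 13 digits and a matching check digit."""
    # Ignore hyphens, spaces, and qualifiers such as "(pbk.)".
    digits = [int(ch) for ch in isbn if ch.isdigit()]
    if len(digits) != 13:
        return False
    # Weights alternate 1, 3, 1, 3, ... across the first twelve digits.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == digits[12]


print(isbn13_is_structurally_valid("9780813228846"))  # True
print(isbn13_is_structurally_valid("9780813228847"))  # False
```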


While cleaning up some records today, I noticed a few videos whose titles appeared in the catalog as:


I wondered if this was some sort of intense art film with no title, or if someone was intentionally or unintentionally messing with catalogers by giving their film a title more often seen as a GMD. (Was this a film about the history of the GMD??)

I clicked through and found that it was a perfectly normal film, with the title:

  $100 a day : justice and reparation in California's
    legal system

And this had somehow ended up in the MARC record as:

  245 00 ǂ1 00 a day : justice and reparation in California's
    legal system

The dollar sign is a particularly dangerous character to have in your data if you’re not careful with your processing. In many programming languages it signals the beginning of a variable, so “$100” may be interpolated with unpredictable (or erroneous) results. It’s also a common convention for representing the subfield delimiter in a MARC record, so:

  245 00 $a $100 a day

might have looked like it contained an empty subfield $a and been cleaned up in text to form:

  245 00 $100 a day

and then reinterpreted as:

  245 00 ǂ1 00 a day

It’s important to sanitize your input!
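To make the failure concrete, here is a small sketch of how a parser that treats every “$” as a subfield delimiter misreads this title, next to one that splits on the actual MARC subfield delimiter character (hex 1F). Both function names are hypothetical, and this is only my guess at the kind of processing that went wrong, not a reconstruction of what any particular system did.

```python
import re

SUBFIELD_DELIMITER = "\x1f"  # the delimiter used in binary MARC records


def naive_subfields(field_text: str) -> list:
    """Risky: treat '$' plus the next character as a subfield code."""
    return [(m.group(1), m.group(2).strip())
            for m in re.finditer(r"\$(.)([^$]*)", field_text)]


def delimiter_subfields(field_text: str) -> list:
    """Safer: split only on the real delimiter, so '$' stays ordinary data."""
    return [(chunk[0], chunk[1:].strip())
            for chunk in field_text.split(SUBFIELD_DELIMITER) if chunk]


print(naive_subfields("$a $100 a day"))
# [('a', ''), ('1', '00 a day')]
print(delimiter_subfields("\x1fa$100 a day"))
# [('a', '$100 a day')]
```

The naive version produces both the empty $a and the bogus subfield 1 seen in the record above.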


Standards: ISO 2709

Did you know that ISO 2709 (the standard of which MARC21 is an instance) is fairly general? It allows for:

  • tags which include letters and numbers (though they must still be of length three, like MARC21’s numeric tags)
  • up to nine indicators (MARC21 always has two)
  • subfield codes of length up to nine (MARC21 always has this length as two, as subfield codes like “ǂb” are two characters long)

The indicator count and subfield code length are encoded in every MARC record’s leader in positions 10 and 11; note that the MARC21 spec says these should always be “2” and “2”.

I revisited this standard again recently to determine why some vendor records were causing trouble; their leaders had position 11 set to “0”, indicating that the subfield code length was zero:

    01710nam a2000385 a 4500

though the record was full of subfield codes of length 2 (like ǂa).

This was easy enough to fix in batch (that leader position should always be “2”, so I just overwrote what was there) and I’ve contacted the vendor to let them know about the strangeness in their records. Hopefully they’ll be fixed on the vendor’s site before they cause any more trouble!
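For anyone facing the same problem, the check and the overwrite can be sketched in a few lines. This assumes you already have each record’s 24-character leader as a string; the function names are made up for illustration, and this is not the exact script I ran.

```python
def leader_counts(leader: str) -> tuple:
    """Return (indicator count, subfield code length) from positions 10 and 11."""
    if len(leader) != 24:
        raise ValueError("a MARC leader is exactly 24 characters long")
    return leader[10], leader[11]


def repair_leader(leader: str) -> str:
    """Overwrite positions 10 and 11 with the values MARC21 requires ('2', '2')."""
    leader_counts(leader)  # reuse the length check before rewriting
    return leader[:10] + "22" + leader[12:]


vendor_leader = "01710nam a2000385 a 4500"
print(leader_counts(vendor_leader))  # ('2', '0')
print(repair_leader(vendor_leader))  # 01710nam a2200385 a 4500
```

Overwriting is only safe because the record body really does use two-character subfield codes; the leader was wrong, not the data.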


Золотая книга сказок / Божена Немцова ; перевод с чешского А. Серобабина. (OCLC #959887720)

Whenever I create records in OCLC with non-Roman characters, I notice that it helpfully adds an 066 field to my record, like:

    066 __ǂc (N

In this case, the code “(N” indicates the presence of Basic Cyrillic script in the record, mostly as a signal to the user that extra processing may be required.

So why the weird code? I’d wondered this for a while, and finally looked it up. Longer blog post coming soon.


Oxidative stress and biomaterials / edited by Thomas Dziubla and D. Allan Butterfield. (OCLC #938383040)

We found brief copy for this title in OCLC, and upgraded it to RDA. One change we made was converting the 260 field to a pair of 264s.

The 260 field may be used to record places, dates, and parties responsible for the publication, distribution, and manufacture of the title, as well as copyright information. This field may be used in RDA records, though several elements map to the same subfields; the RDA Toolkit’s MARC Bibliographic to RDA Mapping says that ǂa and ǂb are for publication and distribution information, and ǂe and ǂf are for production and manufacture information.

The 264 field may also be used to record the data above, but in a more granular/specific way: the second indicator specifies whether its 264 field is about production (0), publication (1), distribution (2), manufacture (3), or is a copyright date (4).

While upgrading records to RDA, I always convert existing 260s to 264s to record the most specific information I can while I have the piece in hand, using multiple fields if needed:

    264 _1 ǂa Amsterdam : ǂb Elsevier Academic Press, ǂc [2016]
    264 _4 ǂc ©2016
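The second-indicator values described above amount to a small lookup table. Here is a sketch of one in code, handy when reporting on which kinds of 264 statements a batch of records contains; the names are hypothetical.

```python
# Meaning of the 264 second indicator, as described above.
FUNCTION_BY_SECOND_INDICATOR = {
    "0": "production",
    "1": "publication",
    "2": "distribution",
    "3": "manufacture",
    "4": "copyright date",
}


def describe_264(second_indicator: str) -> str:
    """Return the function of a 264 field, or 'unknown' for any other value."""
    return FUNCTION_BY_SECOND_INDICATOR.get(second_indicator, "unknown")


print(describe_264("1"))  # publication
print(describe_264("4"))  # copyright date
```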

Reconstrucción del olvido (1991-1992) / Daniel Gutiérrez Pedreiro. (OCLC #945639440)

The MARC fixed field LitF uses a single character to indicate the Literary Form of the work. For example, this work is a collection of poetry, so it is marked:

    LitF: p