Metadata Standards

Digital libraries are all about metadata. You can think of metadata fields as the blanks that should be filled in for each record in an electronic card catalog. On its own, though, metadata is a nebulous concept, and records would be difficult to share without a bit of additional structure.

To provide this structure, several standards bodies have issued metadata standards to help organizations create and share records.

Primary among these (in educational circles) is the Dublin Core standard issued by the Dublin Core Metadata Initiative (DC and DCMI respectively). DC defines a standard set of fields to describe objects. While there are other formats that also do this (IEEE’s Learning Object Metadata for one), DC is by far the most widely implemented.

One of the reasons DC is so widely implemented is that it is very easy to use. There are only a few core fields that should be filled in (Title, Description, Creator, etc.), although there are many optional fields to help describe resources in depth. Additionally, there is no hierarchy in the XML (for base fields), which simplifies matters for the non-technical. Finally, it is extensible. This matters because institutions with technical expertise can customize it with qualified vocabularies and specialized fields to fit their needs.

Another reason DC is so widely implemented is that it is the only format required of repositories supporting the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), most commonly referred to as OAI. As nearly all archives interested in sharing metadata records support this protocol, DC has become the lingua franca among educational digital libraries.
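To make the harvesting side a little more concrete, here is a rough sketch of how a harvester might build a request for DC records. The verb ListRecords and the metadata prefix oai_dc come from the OAI-PMH protocol itself; the base URL and the helper name list_records_url are made up for illustration, and the sketch only builds the URL without contacting any server.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; a real repository publishes its own OAI-PMH base URL.
BASE_URL = "https://repository.example.edu/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (illustrative helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional: restrict the harvest to one set defined by the repository.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


url = list_records_url(BASE_URL)
print(url)
```

Because every OAI repository has to answer a request for oai_dc, a harvester written this way can talk to any of them without knowing what richer formats each one happens to offer.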

Here is an example of four fields from a very simple DC record:

  <dc:title>The Oklahoma Daily</dc:title>
  <dc:description>OU's campus newspaper.</dc:description>
  <dc:publisher>University of Oklahoma</dc:publisher>
  <dc:identifier>http://oudaily.com/</dc:identifier>

If you ignore the dc: prefixes (they just mark each field as belonging to the Dublin Core namespace), you’ll see that it’s just a way of storing the title, description, publisher, and location of the object. Because these things are defined in predictable ways, computers can automate the exchange of these records.
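As a small illustration of that automation, here is a sketch that reads the four fields above with Python's standard library. The fragment needs a wrapper element that declares the dc namespace before a parser will accept it, so this assumes the oai_dc:dc container used in OAI-PMH responses. The function name parse_dc is invented for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URI for the simple Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"

# The four fields from the post, wrapped in an oai_dc:dc container
# (an assumption; the post shows only the inner fields).
RECORD_XML = """\
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>The Oklahoma Daily</dc:title>
  <dc:description>OU's campus newspaper.</dc:description>
  <dc:publisher>University of Oklahoma</dc:publisher>
  <dc:identifier>http://oudaily.com/</dc:identifier>
</oai_dc:dc>
"""


def parse_dc(xml_text):
    """Return a dict mapping each DC field name to a list of its values."""
    root = ET.fromstring(xml_text)
    fields = {}
    for child in root:
        # ElementTree spells namespaced tags as "{namespace-uri}localname".
        if child.tag.startswith("{" + DC_NS + "}"):
            name = child.tag.split("}", 1)[1]
            # DC fields may repeat, so collect values in a list.
            fields.setdefault(name, []).append((child.text or "").strip())
    return fields


record = parse_dc(RECORD_XML)
for name, values in record.items():
    print(name, "=", "; ".join(values))
```

Since the record is flat, the whole thing reduces to a simple loop over the child elements, which is a big part of why DC is so approachable.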

When you’re talking about exchanging millions of records, like the NSDL does, that becomes quite important.

Next time I’ll talk a bit more about controlled vocabularies and qualified metadata standards.

This entry was posted on Sunday, September 7th, 2008 at 6:04 am and is filed under Digital Collections.


The D-Light of Digital Collections