Friday, February 22, 2013

Reading #4


Foulonneau, M., & Riley, J. (2008). Technical interoperability. In Metadata for Digital Resources: Implementation, Systems Design and Interoperability (pp. 154-164). Oxford: Chandos Publishing.

This section of the book describes how metadata can be used to represent complex or multi-part resources and outlines best practices for interoperability.  The METS (Metadata Encoding and Transmission Standard) Schema is the most heavily used schema for complex resources in the cultural heritage field.  METS files are unique in their ability to include the set of files that make up the digital resource, structural metadata, descriptive metadata, administrative metadata and behavioral metadata.  Furthermore, the METS schema does not mandate which metadata formats should be used to describe the resource, but rather acts as a container that holds together all of the metadata formats involved in describing the digital files.  For example, the METS Navigator application, developed at Indiana University, uses the structural map and pointers defined in the METS file to access the multiple parts of the object.  Together with a page-turning interface and user-friendly navigation, METS Navigator provides simple, intuitive and high-level browsing, flipping and searching capabilities.  Concerning interoperability, the authors discuss several challenges in technical interoperability, such as information loss and ensuring accurate description of the desired entity, otherwise known as the one-to-one principle.  Metadata mapping, the practical stage of technical interoperability, is outlined with a section on mapping tools.  Methods using XSLT stylesheets, Java, Perl and Python are briefly discussed.

Week 22: Mapping METS

I finished up the comparison documentation and then started looking at Best practices for METS at IU and reviewing the specs.  I then also looked at guidelines for mapping TEI to METS to see some sample METS XSLT.  I will probably write XSLT to transform from METS Navigator 1.0 to 3.0.  I then read a few chapters from 'XSLT Quickly' by Bob DuCharme and also studied from www.w3schools.com/xsl/default.asp.  I didn't actually do many things this week, but it actually took a lot of time, since a lot of it was reading and studying. 

Friday, February 15, 2013

Week 21: METS comparisons

I met with Michelle again this week to talk about how to start documenting the comparisons between the target document for the METS Navigator 3 migration and various METS Navigator 1.0 files.  Using a spreadsheet I have started the comparison work.  What I did was break the document into the 5 METS sections that the DLP will use.  Then I created a column for the target and then individual columns for the METS Navigator 1.0 sample documents.  This way, one can see clearly the differences between 1.0 and 3.0 for the Root or File Section.  I also highlighted the parts that I thought were exceptional or contributed to a very important difference.  For example, I noticed that in the logical part of the Structural Map, the TYPE="logical" is what provides the side nav.  If a page or section level div has no LABEL, then that information will not appear in the side nav.  Or, better said, the side nav will be suppressed.  After creating the spreadsheet, I wrote a clearer and more verbose Word document explaining all of the differences, which will then be used to do the transformation.
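
The LABEL observation above can be turned into a quick check. The following is only a minimal Python sketch, not part of METS Navigator: it assumes the standard METS namespace (http://www.loc.gov/METS/), and the function name divs_missing_label and the sample document are made up for illustration. It lists the divs in the logical Structural Map that have no LABEL.

```python
import xml.etree.ElementTree as ET

METS_URI = "http://www.loc.gov/METS/"
METS_NS = {"mets": METS_URI}

def divs_missing_label(mets_xml):
    """Return the IDs of divs in the logical structMap that have no LABEL.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(mets_xml)
    missing = []
    for struct_map in root.findall("mets:structMap", METS_NS):
        # Only the logical map feeds the side nav, per the notes above.
        if struct_map.get("TYPE") != "logical":
            continue
        for div in struct_map.iter("{%s}div" % METS_URI):
            if not div.get("LABEL"):
                missing.append(div.get("ID", "(no ID)"))
    return missing

# Made-up sample document, not a real DLP file.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:structMap TYPE="physical">
    <mets:div ID="phys1"/>
  </mets:structMap>
  <mets:structMap TYPE="logical">
    <mets:div ID="log1" LABEL="Chapter 1">
      <mets:div ID="log2"/>
    </mets:div>
  </mets:structMap>
</mets:mets>"""

print(divs_missing_label(SAMPLE))
```

Divs in the physical map are ignored on purpose, since only the logical map is described as driving the side nav.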

Friday, February 8, 2013

Reading #3


Marshall, C. C., & Bly, S. (2005). Turning the page on navigation. In Proceedings of the 5th ACM/IEEE-CS joint conference on digital libraries, (pp. 225-234).

This paper presents the results of two observational studies on reading and document navigation behaviors in serials. The first study documented readers interacting with paper serials, and the second study examined the same users interacting with digital serials. The authors found that readers of magazines, journals, textbooks, and anthologies do not read the entirety of these materials or read the documents from beginning to end; rather, they skim, skip, and scan the material. Navigation must support this type of use, particularly by providing some type of "lightweight navigation" and flipping capabilities. Lightweight navigation includes behaviors such as focusing in on specific sections of a page or glancing back or ahead in a document. In paper form this might entail folding a newspaper so that only one or two columns of text are visible. In digital form, a reader could zoom in on a page to simulate the same effect. Flipping allows a fast visual scan of a serial. It is more problematic to provide this type of navigation in digital form, but thumbnail scrolling might be one solution. Jumping behaviors that occur during paper serial reading are much more difficult to render in digital forms. Metadata may offer comparable functionality by allowing readers to navigate to subsections of a document or to other articles in the periodical, but this type of navigation is far less fluid.

Week 20: Wrapping up Schematron?

So, this week I finished up the rest of the Schematron work, with the Encoder and Editor checks.  I still had a few questions about the work, and I included these as comments in the file.  For example, I was supposed to test that there was a head tag immediately following a div.  However, that rule would be too strict, since sometimes a pb (page break) intervenes.  In addition, not all the divs have heads, especially the ones that are within floating text.  So, my issue is how to write the rule so that Schematron catches only the divs that are supposed to contain a head but don't.  So for now I think I'm done with Schematron, but will probably have to go back to it in the future since I have lingering questions.
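
To make the logic of the head check concrete, here is a rough sketch in Python rather than Schematron. It rests on assumptions that go beyond the notes above: the files use the TEI P5 namespace, "supposed to contain a head" is read as every div outside a floatingText, and a pb is the only element allowed to come before the head. The function name and sample are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def divs_without_head(tei_xml):
    """Return xml:ids of divs whose first child (ignoring pb) is not a head.

    Divs inside floatingText are skipped (an assumption, see lead-in).
    """
    root = ET.fromstring(tei_xml)
    flagged = []

    def walk(elem, in_floating):
        for child in elem:
            floating = in_floating or child.tag == TEI + "floatingText"
            if child.tag == TEI + "div" and not floating:
                # First child element that is not a page break.
                first = next((c for c in child if c.tag != TEI + "pb"), None)
                if first is None or first.tag != TEI + "head":
                    flagged.append(child.get(XML_ID, "(no xml:id)"))
            walk(child, floating)

    walk(root, False)
    return flagged

# Made-up sample: d2 has a pb before its head, d4 sits in floating text.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<div xml:id="d1"><head>One</head><p>Text</p></div>
<div xml:id="d2"><pb n="5"/><head>Two</head></div>
<div xml:id="d3"><p>No head here</p>
  <floatingText><body><div xml:id="d4"><p>A letter</p></div></body></floatingText>
</div>
</body></text></TEI>"""

print(divs_without_head(SAMPLE))
```

Only d3 is reported here: d2 passes because the pb is skipped, and d4 is ignored because it sits inside floating text.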

I continued the readings on METS Navigator as well and then moved on to analyzing some METS documents in order to understand their architecture.  I learned that there are 7 sections of a METS document: Header, Descriptive Metadata, Administrative Metadata, File Section, Structural Map, Structural Links and a Behavioral Section.  The Header, File Section and Structural Map will be the most important sections for METS Navigator purposes.  The Structural Map is the only section that is required in a METS document, and both its physical and logical sections are the most important for the METS Navigator page turning service.  The logical section provides the hierarchy and side nav so that a user can jump to a specific section within the electronic text.  So, I'm going to try and pay most attention to this section.
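
The seven sections correspond to the METS element names metsHdr, dmdSec, amdSec, fileSec, structMap, structLink and behaviorSec. As a small illustration, and only a sketch with a made-up sample file and a hypothetical function name, a few lines of Python can report which sections a given METS document actually contains:

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"

# Element names of the seven METS sections, in schema order.
SECTIONS = ["metsHdr", "dmdSec", "amdSec", "fileSec",
            "structMap", "structLink", "behaviorSec"]

def section_counts(mets_xml):
    """Count how many times each top-level METS section occurs."""
    root = ET.fromstring(mets_xml)
    return {name: len(root.findall(METS + name)) for name in SECTIONS}

# Made-up minimal document with a header, a file section and two structMaps.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:metsHdr/>
  <mets:fileSec/>
  <mets:structMap TYPE="physical"><mets:div/></mets:structMap>
  <mets:structMap TYPE="logical"><mets:div/></mets:structMap>
</mets:mets>"""

counts = section_counts(SAMPLE)
for name in SECTIONS:
    print(name, counts[name])
```

A count of zero for structMap would flag a document that is missing the one required section.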

Friday, February 1, 2013

Week 19: Go METS!

More work this week on Schematron editor and encoder versions.  Not much to report other than I keep finding new ways to test.  I think I've found the best way and then I realize that no, there's a better way that is more reliable or comprehensive.  I started the next step for Schematron which is checks for critical introductions and bios.  The critical and biographical introductions are authored by students and accompany the source literary texts.  There are specific encoding guidelines for each which means that I need to spend a little time learning these guidelines in order to decide what needs to be checked.  Like with the TEI Header and body checks I still need to check for all of the '$' template values to make sure they are gone.  I need to check for xml:id="encoderusername".  For the intro checks specifically I need to make sure that the first part of the xml:id and the <idno> match the related text with regards to the value of @ref.  In other words, the 'nnnn' part of the xml:id="VABnnn_intro" needs to match the numbers in @ref.  For the bio checks I need to make sure there is a notesStmt and to make sure there are two line breaks after each header.  Now that I've had more practice, this has come a bit more easily to me. 
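
Two of the checks above, leftover '$' template values and the placeholder xml:id="encoderusername", are simple enough to sketch outside Schematron. The Python below is an illustration only: the function name, the reason strings and the sample header are invented, and it does not attempt the xml:id/@ref matching check.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def leftover_placeholders(tei_xml):
    """Return (element name, reason) pairs for template leftovers."""
    root = ET.fromstring(tei_xml)
    problems = []
    for elem in root.iter():
        tag = elem.tag.split("}")[-1]  # drop the namespace part
        # A '$' in any attribute value suggests an unfilled template value.
        if any("$" in value for value in elem.attrib.values()):
            problems.append((tag, "template value in attribute"))
        if elem.get(XML_ID) == "encoderusername":
            problems.append((tag, "placeholder xml:id"))
        # Check the element's own text and the text that follows it.
        if "$" in (elem.text or "") or "$" in (elem.tail or ""):
            problems.append((tag, "template value in text"))
    return problems

# Made-up sample header, not taken from the project's template.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
<teiHeader><titleStmt><title>$TITLE</title>
<respStmt><name xml:id="encoderusername">Jane</name></respStmt>
</titleStmt></teiHeader>
<text><body><p>Clean text.</p></body></text></TEI>"""

for problem in leftover_placeholders(SAMPLE):
    print(problem)
```

A plain '$' test would also flag legitimate dollar signs in the literary text itself, so a real rule would probably need to match the template's exact placeholder pattern.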

My next project is going to be working on writing the specifications and drafting XSLT for the eventual migration of data from METS Navigator 1.0 to METS Navigator 3.0.  METS Navigator is used here at IU for the page turning service.  Now, most of the page service is stand-alone, but soon DLP wants to change that into a pop-up.  In order to have that happen, the data needs to be moved.  I have started doing some readings on METS Navigator in order to understand its history and capabilities.  I look forward to starting work with METS.  Plus, I like the name, being a New Yorker and all.