Thursday, August 30, 2012

Week 2: Building Schematron--The Foundation

So, this week was exciting. After spending a few good hours looking at the VWWP guidelines and some completed and in-progress documents on Xubmit, I began the conceptual mapping for Schematron. I started identifying the elements and attributes that will need Schematron validation, as well as the values that need encoder input, while watching out for any xml:id values.

Michelle had created two Excel spreadsheets for me: a 'fluffy' version and a Structured Assertions or Reports version. The 'fluffy' version is for recording the elements and attributes that are to be checked, the XPath needed to identify the appropriate node, and a description of what I will be checking (basically a rough version of the message that the user will see if there is a problem with the validity of their XML document). The Structured Assertions or Reports version includes the context (the XPath), the test (either an assert or a report), and the assertion or message.
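To make the three columns of the structured spreadsheet concrete, here is a minimal sketch of how one row might eventually look as Schematron. The context, test, and message map onto the rule's context attribute, the test attribute, and the text inside assert or report. The specific checks (a div needing a type attribute, an empty xml:id) are hypothetical placeholders I made up for illustration, not rules from the VWWP guidelines.

```xml
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <!-- Bind the tei: prefix to the TEI namespace so the XPaths below resolve -->
  <ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
  <pattern>
    <!-- Context: the XPath that selects the nodes to be checked -->
    <rule context="tei:div">
      <!-- assert: the message appears when the test is FALSE (hypothetical check) -->
      <assert test="@type">
        Each div should carry a type attribute.
      </assert>
      <!-- report: the message appears when the test is TRUE (hypothetical check) -->
      <report test="@xml:id and normalize-space(@xml:id) = ''">
        This div has an xml:id attribute with no value.
      </report>
    </rule>
  </pattern>
</schema>
```

The difference between the two test types is only the polarity: an assert states what should be true, while a report flags something noteworthy when it is found.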

I began with the 'fluffy' version since I thought it would be a good way to understand the architecture of the XML documents. I wanted to make sure I had mapped out all of the elements and attributes present in the encoded documents, along with their hierarchies and relationships. This took up a lot of my time, but I think it was necessary for me to really get a grasp of what I would be working with. When I met with Michelle on Wednesday, August 29th, she gave me a better understanding of what I really needed to check for, and that put me back on track. With that knowledge, I was able to quickly discern which elements, attributes, and values needed to be checked. Several values are imported from the MARC record. For example, in the title statement (<titleStmt>), the value of the author element is pulled directly from the MARC 100 entry, so it does not need to be validated with Schematron.

During this time we also identified that I would have to pay attention to elements affected by the change from <biblFull> to <biblStruct>. While this is not a great problem, I have to remember to account for the <biblFull> element and its ancestors when I get to the stage of actually authoring Schematron.
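One way I could imagine handling both structures is to write a separate rule for each, so that documents encoded either way get checked. This is only a rough sketch under my own assumptions: that the bibliographic element sits inside <sourceDesc>, and that the thing being checked is a date, which lives in a different place in each structure. Neither the paths nor the checks are taken from the VWWP guidelines.

```xml
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
  <pattern>
    <!-- Assumed check: sourceDesc holds one of the two structures -->
    <rule context="tei:sourceDesc">
      <assert test="tei:biblFull or tei:biblStruct">
        The source description should contain either biblFull or biblStruct.
      </assert>
    </rule>
    <!-- Documents still using biblFull (placeholder check) -->
    <rule context="tei:sourceDesc/tei:biblFull">
      <assert test="tei:publicationStmt/tei:date[normalize-space()]">
        The publication statement inside biblFull should include a date.
      </assert>
    </rule>
    <!-- Documents using biblStruct (placeholder check) -->
    <rule context="tei:sourceDesc/tei:biblStruct">
      <assert test="tei:monogr/tei:imprint/tei:date[normalize-space()]">
        The imprint inside biblStruct should include a date.
      </assert>
    </rule>
  </pattern>
</schema>
```

Because each rule's context names its parent explicitly, the same kind of check can follow the element through both hierarchies without one rule accidentally firing on the other structure.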

I finished up my week by writing down several questions for Michelle to review. I have to say I am enjoying this process. I get to work with TEI (although more indirectly now) and I'm becoming more confident with XPath. And the quality control aspect is challenging, in a good way. It really demands that I make many logical decisions and consider all the possibilities of the encoding process. I have a bit more of the 'fluffy' version to work on, and then I think I can move on to the more detailed spreadsheet.
