Foulonneau, M., & Riley, J. (2008). The future of metadata. In Metadata for Digital Resources: Implementation, Systems Design and Interoperability (pp. 187-197). Oxford: Chandos Publishing.
Metadata began as a simple set of functions, primarily for cataloging. Increasingly, the flexibility and extensibility of metadata are gaining value. A new challenge facing the cultural heritage sector is developing new ways of gathering large amounts of data (conceptually), relating it to the relevant resources, and using and reusing the metadata across a wide spectrum of applications. The authors address four trends that are influencing, and will continue to influence, metadata work in the years to come. First, automated metadata generation is predicted to keep its place in digital workflows. Some types of metadata are well suited to automated generation, while others, such as descriptive metadata, are harder to generate automatically. DC Dot is a tool that automatically generates metadata from web pages and can suggest keywords after analyzing the text. Tools like this are still in development, and manual work is usually still needed. The second trend is the influence of Web 2.0. The authors predict that user participation, such as recommendations and reviews, tagging, and content sharing, has great potential for enriching digital library applications. The third trend concerns strategies for metadata management. The authors suggest broad and accommodating, yet clearly defined, usage conditions for metadata records in order to provide the greatest possible flexibility for future use. Lastly, the authors suggest that as metadata changes, so must the institution’s mission statement. In particular, the mission statement should address cooperation between institutions. An institution’s ability to position itself directly inside the circle in which its users and colleagues exist is directly related to its ability to fulfill its primary mission.
Friday, April 26, 2013
Wednesday, April 24, 2013
Week 29: Wrapping Up
So, not much can be said of this week. Unfortunately, the work on Vict Bib that other people were doing is not finished yet, so I cannot complete my part. I am quite disappointed and really wish that I could have left here on a high note. Instead of writing about what I did this week, I thought I'd wrap up this blog by addressing a question that I have been asked many times by people outside of the library/information science world. The question is, "What are the differences between HTML and XML?" I have tried in many ways to answer the question, but once and for all, I want to have my answer here.
So, what are the main differences between HTML and XML? (Not in any particular order)
1. HTML marks up a page for display in a browser, while XML can carry data between platforms and applications, so the same XML can be reused in many contexts.
2. HTML has pre-defined tags, while XML is more flexible and allows the inclusion of custom tags created by the author of the document.
3. HTML is more relaxed about closing tags than XML. In XML, every element must be closed or the document is not well-formed.
4. HTML was created to structure and present documents on the web, whereas XML was designed to store and transport data between systems.
5. The most important distinction is that HTML is concerned with how the data looks, but XML exists to describe the data and is concerned with presentation only if it further reveals the meaning within the data.
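To make points 2 and 3 a bit more concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree parser. The tag names (record, author, title) and the helper functions are made up for illustration; the point is only that XML accepts author-defined tags but rejects a document with a missing closing tag.

```python
import xml.etree.ElementTree as ET

# Hypothetical record with author-defined tags; XML does not predefine these names.
WELL_FORMED = "<record><author>Jane Doe</author><title>A Sample Title</title></record>"

# The same record with the closing </author> tag left out.
MISSING_CLOSE = "<record><author>Jane Doe<title>A Sample Title</title></record>"


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def read_fields(xml_text):
    """Return a dict mapping each child tag name to its text content."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}
```

Calling is_well_formed on the second string returns False, whereas a web browser would typically still display similar markup in an HTML page.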
So, I guess I need to write down a quick breakdown of what I have done during my time here at the DLP.
1. I have rewritten the Schematron for validating TEI-encoded Victorian Women Writers Project texts. Through this process I gained experience with XML, XPath and quality control of electronic texts while increasing my comfort level with TEI.
2. In collaboration with metadata experts across IU Libraries, I helped to define a core metadata set for use in the ICO and Photocat. I developed my analysis and survey-writing skills while balancing big-picture and small-scale thinking. I became acquainted with MODS while mapping the Photocat fields to MODS, and later while tweaking existing XSLT and drafting a new XSLT for the new core set.
3. I was introduced to METS during the METS Navigator project. I came to appreciate the transition and great amount of work necessary to migrate sets of data from one iteration to the next. I did further mapping and analysis work and tried my hand at writing specifications to be turned over to programmers to create a new and improved METS Navigator 3.0 with pop-up found through finding aids.
4. The last project, as short as my time with it was, allowed me to work with CSV files, a format I had not worked with before, and introduced me to online data extraction tools (cb2bib, text2bib). I also came to have a greater understanding of and respect for the steps necessary and the care it takes to migrate large amounts of data.
Although I knew that working in any environment with other people demands a great deal of patience, I have to admit my patience was tried multiple times here, especially during the last part of my second semester. While I was frustrated a lot at the beginning of my internship, that was more with myself, trying to figure out Schematron. This time, I became frustrated because I felt I had to wait a lot for people to finish their part of the project before I could do mine. I don't blame anyone, and I understand it is just part of a working environment. That being said, I really valued my time here and I can honestly say I've learned more than I could have by just taking classes. I don't know what the future holds for me, but I know I am better prepared than I was before my digital library life.
Friday, April 12, 2013
Week 28: Working with CSV files
Michelle had asked me to try to figure out the best way to handle Vict Bib records that have multiple authors. The data will need to be extracted as discretely as possible. The problem right now on the Vict Bib website is that multiple authors are displayed as one entity. Also, there are some glitches where editors, authors and translators are interchanged, mostly when a user performs a search. It seems like a bit of Drupal might solve the problem. The Feeds Tamper module provides preprocessing functionality before the data is mapped to the entity fields, and its Explode plugin will "explode" a delimited value into an array of separate values.
I then met with Michelle, and together we discovered that the CSV file will need to be tweaked so that pipes, rather than commas, separate multiple values. Commas can appear elsewhere, for example in titles, so the best thing is to use "|" instead. However, this requires that we control the CSV export. Michelle recommends that we pass on what I've worked on so far to the programmers to see what kind of export they can provide for me. From there, I will start the ingestion of either the CSV file into cb2bib or BibTeX into Zotero. I only have five hours of my internship remaining, so it will have to be one or the other.
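As a rough illustration of why the pipe helps, here is a small Python sketch using the standard csv module. The column names (title, authors) and the sample rows are invented for this example and are not the actual Vict Bib export; the sketch only shows that a title containing commas survives intact while a pipe-delimited author field splits cleanly into separate names.

```python
import csv
import io

# Hypothetical export: the column names and sample rows are invented for illustration.
SAMPLE_EXPORT = (
    "title,authors\n"
    '"Poems, Letters, and Essays",Jane Doe|John Roe\n'
    "A Single Author Study,Mary Major\n"
)


def split_authors(csv_text, delimiter="|"):
    """Parse the CSV text and split the authors column into a list of names."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Split on the pipe and drop any empty pieces left by stray delimiters.
        authors = [name.strip() for name in row["authors"].split(delimiter) if name.strip()]
        records.append({"title": row["title"], "authors": authors})
    return records
```

This is roughly the same idea as the Explode step in Feeds Tamper, just done outside of Drupal: one delimited string goes in, and a list of separate values comes out.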
Reading #8
Weir, R. O. (2012). Making Electronic Resources Accessible. In Managing Electronic Resources (pp. 69-86). Chicago: ALA.
Friday, April 5, 2013
Week 27: Working with bibliographic data
After completing the list of Vict Bib fields, I moved on to deciding which tool to use for the data extraction and settled on cb2bib. Once I decided that, I began to read manuals on cb2bib configuration. It appears that I may have to do some command line work, so I will need to look into that more. While researching cb2bib I also found some information about BibTeX (the format cb2bib will turn the CSV file into). I learned that BibTeX is a reference management tool and file format, used mainly with LaTeX, for formatting lists of references. It makes it easy to cite sources in a consistent manner by separating bibliographic information from its presentation. Zotero supports the format and can export BibTeX data. After bringing all of this together in my mind, I began the process of mapping the Vict Bib fields to BibTeX fields. Mapping to me is like a puzzle and I really enjoy that part. It's like translation, and as a language person, I understand it pretty easily. While there is not always an equivalence between fields, it's fun and challenging trying to find the closest elements.
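To show what such a mapping can look like in practice, here is a minimal Python sketch. The source field names on the left of FIELD_MAP are assumptions made for illustration, not the real Vict Bib field list; the names on the right are standard BibTeX fields. The to_bibtex helper is likewise hypothetical and is not how cb2bib itself works.

```python
# Hypothetical mapping: the source field names on the left are assumptions;
# the names on the right are standard BibTeX fields.
FIELD_MAP = {
    "Title": "title",
    "Authors": "author",
    "Year": "year",
    "Publisher": "publisher",
}


def to_bibtex(record, cite_key, entry_type="book"):
    """Render one record (a dict of source fields) as a BibTeX entry string."""
    lines = ["@%s{%s," % (entry_type, cite_key)]
    for source_field, bibtex_field in FIELD_MAP.items():
        value = record.get(source_field)
        if not value:
            continue  # no value for this field in the record, so skip it
        if isinstance(value, list):
            value = " and ".join(value)  # BibTeX separates multiple names with "and"
        lines.append("  %s = {%s}," % (bibtex_field, value))
    lines.append("}")
    return "\n".join(lines)
```

A record with a list of two authors comes out with a single author field joined by "and", which is how BibTeX expects multiple names to be written.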