Friday, March 29, 2013

Week 26: cb2bib or text2bib?

I didn't do many things this week, but I really got into the Vict Bib project.  Earlier in the week I began a list of Vict Bib fields based on a spreadsheet and an examination of possible fields on the website.  This was a loooooong process because the website has no way to display all fields automatically.  I had to scroll through many records (pages and pages, actually) to make sure that I had identified all of the fields.

Later in the week I started reading documentation on cb2Bib and Text2Bib.  cb2Bib is a free, open source, multiplatform application for rapidly extracting unformatted or unstandardized bibliographic references from email alerts, journal web pages, and PDF files.  Text2Bib is a PHP script for converting references to BibTeX format.  However, Text2Bib seems unable to detect some of the document types that Vict Bib uses.  Lastly, I read quite a few forum posts on the subject of data extraction.  So, again, this week was not much "doing" but a lot of preparation for what's to come.
