Creating a citation friendly bibliography database

Online library system
2013 gave me an interesting and challenging few months learning and building an online reference / bibliographic database (mainly in my spare time). I had some basic criteria: low cost, easy to learn and use, easily able to import and export data, library standards compliant, zotero friendly, incorporate basic libray functions, provide full text access, submission queues,  and allow multiple user access with differential security levels. First I looked at "off the shelf" products and of a multitude of these, Refbase, EPrints, and Koha stood out as candidates. I really liked Refbase, and it ticked most of the boxes, especially with regard to citation, but sadly, lacked flexibility in fields and library functions and needed extra steps to convert our spreadsheet based data for import. Similarly, EPrints did not stand up to the reputation.  Koha, on the other hand, is a full blown integrated library system. It ticked all boxes except for ease of use and import of data...and ease of adding uploaded "full text" files.  In the end, it was possibly too comprehensive, and while much effort was spent in migrating to the MARC standard, the difficulties in handling multiple authors in a Zotero friendly way, led me to look at an alternate path.

I had been reluctant to build a system from scratch, as even though it would give the flexibility we needed, the need to build in metadata harvesting by Zotero was a whole new learning curve!   Take a look at the end product - a fully functional copy of the system (without the full text attachments)  is available here on Digging Stick.

The database was built on PHP / MySQL, using PHPMaker.  PHPMaker is an excellent productivity tool, providing heaps of bells and whistles, and looking after the coding while we concentrate on system design. The nature of the program ensures an understandable core codebase which can be picked up by another developer.

The design was constrained by the existing dataset of 39,000 records and need to bring across all existing fields. The old system was only able to export its data as a delimited text file. The most efficient way to clean the existing dataset and prepare it for import into the new system was to manipulate the data in a spreadsheet. This need for a simple spreadsheet based import system, was a key constraint to design, necessitating a single main reference table design, rather than 'normalised' tables. Its fine as it is, but can be adapted to a 'normalised' design once the need for backward compatibility with the old system has passed.

Tabbed detail view
PHPMaker makes creating differential secure logins, email notification,tabbed pages, export options, and an integrated menu system a dream. A bit of nifty SQL creates a database view to support a submission authorisation system and a system where records can only be deleted by the library administrator or  a 'super admin'. Authorised users can edit the catalog or add new entries but they never actually change an existing record. This allows the system to be 'wound back' on an individual record basis when errors are introduced.

The interesting part was making the database 'Zotero friendly'. The Zotero documentation lists several ways to expose the site metadata (also see info on Zotero translators).

I implemented 'embedded metadata' in this system, using the <META> tags in the header of the record detail pages. The <META> tags in the header are then parsed and used by Zotero. As noted in the Zotero documentation,

"This fairly standard embedding of RDF metadata can use any RDF vocabularies; Zotero supports most major RDF vocabularies used for bibliographic metadata. For details on this approach, see the Dublin Core description. The translator will also interpret metadata expressed in the Google/Highwire key-value system."   In developing this system, I used a combination of Dublin Core, Google Highwire, and Prism RDF vocabularies.

Where to next?

  1. Migrate to the free and open source PostgreSQL to take advantage of its 'Full Text Search' capabilities.
  2. Normalise the database framework
  3. Develop a site specific Zotero translator
  4. Develop a ZoteroRDF import/export  capability to give another pathway for export of large numbers of records into Zotero and to import harvested records from Zotero - this will allow the librarians to take advantage of Zotero's ability to harvest metadata using the ISBN.


Tag: library repository metadata citation zotero postgresql phpmaker