This is the second part of my blog on exporting MuseumPlus (M+) to XML. I write this part for Frank. In this part, I describe how you get simple XML out of the RTF which was produced in the first part. I wrote the Perl, batch and XSLT years ago and I was really curious whether it still works, and yes, it does! The outcome of this step is nice XML in a structure which obviously mimics the "internal" structure of the database.
Maurice's blog
Exporting MPX, Part I
Submitted by Maurice on Thu, 03/11/2010 - 14:06
Why and what for?
This is a description of how to get XML data out of MuseumPlus (M+). I write this for Rainer and his current project. The idea is to see how this export can be optimized, and how much can be done by others. I originally wrote this export for ethnoArc. This page describes the first part of the process, which could easily be done by someone else.
Technical overview and justification
Tomcat
Submitted by Maurice on Wed, 03/10/2010 - 10:40
My Tomcat problem (diagnosis)
I start Tomcat. It works, serves static and dynamic pages for a few minutes and then seems to freeze. Every dynamic request times out.
In Task Manager, java.exe shows 46432 K memory usage (with 19 threads), and this does not change.
Attempts
I still have this problem after putting -Xms228M -Xmx512M in %JAVA_OPTS% on my laptop, and also when I start Tomcat from the Windows shell in the Tomcat directory.
Looking for Log files
From my OAI Diaries
Submitted by Maurice on Tue, 03/09/2010 - 11:10
I am trying to learn OAI for the MIMO project. I am currently looking at Jeff Young's oaicat. The problem is that I had only a vague idea, or none at all, about the OAI metadata harvesting protocol, Java servlets, Tomcat, Ant etc. when I started to look at it.
Tomcat
XSLT Testing
Submitted by Maurice on Sat, 03/06/2010 - 14:21
I found this on the XSL mailing list: http://assets.expectnation.com/15/event/2/Testing%20XSLT%20Presentation.pdf
The document as such is great, but now I wonder which approach is the best for me, or rather which is the easiest to learn.
At the moment, I am looking at http://utf-x.sourceforge.net/manual/manual.pdf, but I am really not sure this is the way to go. One question seems to be whether to go with or without Java.
Restructuring this site
Submitted by Maurice on Sat, 02/27/2010 - 16:39
Hi everyone!
I decided that this weekend I will take some time to restructure my site. It's the first site I made with Drupal and it looks terrible in every way. I will be changing it a little bit this weekend, so don't be surprised if something doesn't work.
Maurice
Remote Drupal Admin
Submitted by Maurice on Mon, 10/19/2009 - 15:35
I have too many Drupal sites now, and updating modules etc. (i.e. standard tasks) is starting to get annoying. I wonder if there are already tools to control Drupal remotely and if I should write some small scripts using WWW::Mechanize or something else.
Also, I wonder if I should write web tests for the Drupal sites I am developing.
This is just an idea. Here are some links I find helpful:
oai_browser.pl -> oai_browser.exe
Submitted by Maurice on Fri, 09/11/2009 - 23:26
Here is a Windows XP executable of Tim Brody's oai_browser from CPAN.
docx2txt.pl
Submitted by Maurice on Thu, 09/03/2009 - 18:45
Recently, I installed search files http://drupal.org/project/search_files and noticed that I still cannot search my docx files. So here is my first attempt to solve the problem: a little Perl script that reads in a docx, unzips it, and displays the content in no particular order using libxml. Version 0.001, after a few minutes.
Found a similar project: http://docx2txt.sourceforge.net
which uses
Last time I used LibXML
Submitted by Maurice on Sat, 07/25/2009 - 14:47
Recently, Martin transferred recordings from 108 DATs to hard disk. Unfortunately, he didn't preserve the ids on the DATs, but instead segmented the audio into tracks according to silence. This is not the same, since an id sometimes covers one and sometimes two tracks on these DATs. The thing is, the recordings were originally stored on 78s. The ids consistently describe these records, i.e. each record has one id, no matter how many sides it has.
