Thursday, September 18, 2014

DIGITIZING SHAW - THE SHAW QUOTATION DATABASE

My countless followers and readers - "countless" meaning that I don't usually count them - are surely wondering why the blog has been kind of slow as of late. The answer is simple: I'm digitizing stuff. 

I would like to explain how my database came into being, using the materials I'm currently digitizing as an example. In this case, I got a large parcel in the mail that contained an almost complete collection of The Shavian (in its various formats and denominations), ranging from the first-ever issue (1946) to the latest (2014).



The parcel had been kindly sent by Evelyn Ellis, who took the trouble to put each issue in its own envelope and enclose a complete index of the contents. Once opened, the complete collection looked like this:



So, as I usually do in what's left of my lunch break and other stretches of downtime, I lock myself in the photocopying room, where the new scanner of the English Philology Department at the Universidad de Extremadura awaits.



If you look closely at the above picture, you'll see a black memory drive sticking out of the side of the machine. That's where everything I digitize goes, so that I can later edit the documents with the aid of this OCR software.

Given that the Shavians are bound documents, I have to scan them one page at a time, with the additional handicap that they do not fit the standard A4 paper size. Therefore, I have to make use of state-of-the-art technology (a ruler!) and type in the width and length of the issue in question (expressed in millimeters). 



Although you can save any given measurements for later use, you may be surprised to know that not all Shavians are alike (pun intended!). For example, whereas most of them are 280x215 mm, number 7 is 250x185 mm - to mention but one of the several cases in which I had to input a different scanning size. Once you know the area of the spread pages of a given issue, you just have to type in the numbers - always bearing in mind the orientation; i.e. which side goes on the x axis and which on the y axis.
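For the programmers among my readers, the bookkeeping above could be sketched roughly as follows in Python. This is only an illustration, not what the scanner actually runs: the names `DEFAULT_SIZE_MM`, `SPECIAL_SIZES_MM` and `scan_area` are made up, and I am assuming that each measurement is simply a long side and a short side that must be assigned to the x or y axis.

```python
# Hypothetical helper: none of these names come from the scanner's software.
DEFAULT_SIZE_MM = (280, 215)        # most issues, as measured with the ruler
SPECIAL_SIZES_MM = {7: (250, 185)}  # number 7 is smaller; other odd issues would go here


def scan_area(issue_number, long_side_is_x=True):
    """Return the (x, y) scanning area in millimeters for one issue.

    Issues not listed in SPECIAL_SIZES_MM fall back to the default size.
    """
    size = SPECIAL_SIZES_MM.get(issue_number, DEFAULT_SIZE_MM)
    long_side, short_side = max(size), min(size)
    # Orientation matters: decide which side is typed in as x and which as y.
    if long_side_is_x:
        return (long_side, short_side)
    return (short_side, long_side)


print(scan_area(7))                         # (250, 185)
print(scan_area(12, long_side_is_x=False))  # (215, 280)
```

Keeping the exceptions in one small table means the default only has to be typed once, and every oddly sized issue is a single extra entry.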



After all this, the process becomes quite straightforward: you just have to scan each page in succession, trying to silence the symptoms of the carpal tunnel syndrome you're slowly developing.

Still, there are a few things that slow down the whole routine. For example, many of the issues are missing the staples that bind them, which means that you have to be extra careful not to let the page slant out of the scanning area. 



On occasion, the staples leave ugly rust marks in the middle of the page. Luckily, they do not usually impede the digitization and can later be erased or ignored when the document is edited.


There are even some issues with holes cut out of them. In these cases, I've had to put a blank piece of paper behind the page, so that I would not scan what's on the following page through the hole. In the next three images, for example, someone had cut out Bernard Shaw's signature from the reproduction of a manuscript dedication. The hole, of course, spoiled the running text on the next page and the resulting scanned page, as you can see.


Despite all these difficulties, there were some unexpected findings that really spiced up this project. Several issues had different kinds of addenda (donation slips, book advertisements, minutes and reports of the UK Shaw Society) that may become invaluable material if someone ever writes a not-so-short history of the Shaw Societies of this world. A few of these are reproduced in a presentation below. 



Well, I hope you liked what I had to share about one of my current projects. At least now you know why I haven't been picking up the phone lately.

6 comments:

  1. I feel your carpal tunnel pain ;-)!

  2. Just an experiment. Did this publish? It still seems to ask for you to select your profile or whatever. In fact, it insists that I choose a profile, and I have no idea what that means. I'm trying "anonymous." It won't let me write in my name.

  3. Found a place to enter my name. All is well, I guess.

  4. Thank you for these heroic efforts, Gustavo! Future generations of students and Shaw scholars will be in your debt.

    Replies
    1. Well, you know, students and Shaw scholars tend to be in debt as a general rule. Impecuniosity - as Shaw would put it - is in the Shavian DNA.
