Extracting Official Bulova "Fair Trade" price lists data to a relational database. Help w/ Parsing data

Submitted by William Smith on April 4, 2013 - 10:51am

 

Beginning possibly as early as the 1930s and continuing through the late 1960s or 1970s, Bulova released quarterly "Fair Trade model price lists" to dealers and distributors (jewelry stores, retail outlets, etc.). To be an official Bulova dealer/retailer during an unknown subset of this timeframe, retailers entered into a contract with Bulova. These contracts apparently varied by state, but the intent was national. Under the contract, the seller agreed to honor the "fair trade" prices set by Bulova in these lists. Besides the quarterly Spring, Summer, Fall and Winter lists, Bulova also issued periodic monthly supplements between them.

The lists contain detailed information on Bulova watch models, including, in some instances: gold color/content of case, dial color/variation and/or embellishments, official Bulova advertised model names, unique internal Bulova model ID number, and certain characteristics which constitute a variant of the base model.  The lists, however, don't show what the watches looked like.  Vintage ads do.

Much of the early Bulova history has reportedly been lost, so we have been rebuilding this history using vintage ads for model identification.  For many models, we don't have vintage ads on site to help determine the characteristics that may constitute variants of a base model.  These price lists may allow us to determine some characteristics for specific model variants until vintage ads or other documentation come along to confirm them.  They could be useful for tentatively ID'ing some models, similar to how crystal specifications are used.

The six lists we currently have range from November 1954 through Spring 1964.  We are missing many of the quarters within this range, and have only one monthly supplement. These lists are high-resolution PDFs.  They are viewable and downloadable from here: http://community.nawcc.org/Resources/ViewDocument/?DocumentKey=5c0d2a19-268f-4f5c-8b25-cdb1ffb2fa28
You will need to have a free NAWCC guest account, or be an NAWCC member, to download from the link above.  If you have any trouble downloading them, send me a personal message with your email and I'll FTP them to you.

I'm hoping you folks out there can find more of these price lists to fill in the gaps between the ones we have, or can find older lists dating back before 1954.  Please let us know if anyone has some of these lists lying around.  Most dealers didn't keep the older lists, as each new list superseded the previous one.  They were often simply thrown away, or tossed into a box for storage.  The older lists were probably about as useful as last year's classified ads... until now.  No one thought future Bulova collectors and historians would discover their utility.

I'd like to make a database from the information in these lists to help with model ID'ing.  It's hard to use these data when we can only view them as PDFs.  As they are now, we can do some simple text searches in Acrobat.  However, if they could be imported into a database, then we could search for patterns.  At the very least, we could subset entries by model name and year.  The first challenge is getting these data out of the PDFs and into the database.
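
To make the database idea a bit more concrete, here is a rough sketch of what a first version could look like, using Python and its built-in sqlite3 module. The table name, column names, and the sample rows are all made up for illustration (the model names and numbers are placeholders, not real Bulova entries); the real schema would depend on what the lists actually contain.

```python
import sqlite3

def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    # Hypothetical schema: one row per model entry per price list.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS entries (
               list_label  TEXT NOT NULL,     -- e.g. 'Spring 1964'
               list_year   INTEGER NOT NULL,
               model_id    TEXT NOT NULL,     -- internal model ID number
               model_name  TEXT NOT NULL,
               description TEXT,
               price_cents INTEGER            -- price stored as whole cents
           )"""
    )
    return conn

def add_entries(conn, rows):
    """Insert an iterable of 6-tuples matching the column order above."""
    conn.executemany("INSERT INTO entries VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()

def find(conn, model_name, year):
    """Subset entries by model name and year."""
    cur = conn.execute(
        "SELECT list_label, model_id, description, price_cents "
        "FROM entries WHERE model_name = ? AND list_year = ? "
        "ORDER BY model_id",
        (model_name, year),
    )
    return cur.fetchall()

# Placeholder rows, invented purely to exercise the code.
DEMO_ROWS = [
    ("November 1954", 1954, "1234", "EXAMPLE A", "yellow", 4500),
    ("Spring 1964", 1964, "1234", "EXAMPLE A", "yellow", 4950),
    ("Spring 1964", 1964, "1235", "EXAMPLE B", "white", 7150),
]

if __name__ == "__main__":
    conn = open_db()
    add_entries(conn, DEMO_ROWS)
    for row in find(conn, "EXAMPLE A", 1964):
        print(row)
```

Prices are kept as whole cents so that comparisons between lists don't run into floating-point rounding. A single flat table is only a starting point; splitting lists and models into separate tables could come later once the real columns are known.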

Is anyone familiar with parsing PDFs?  Maybe someone out there knows how to do this?  I'll tackle the database once I can get the data out of the PDFs.  I ran production-grade OCR on the PDFs of the price lists, but I'm still having trouble reading the header of the resulting PDF to do the parsing.  I think something as simple as an AWK script would work for column data extraction; I just haven't used AWK on PDFs.
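
For what it's worth, here is a minimal sketch of the column extraction step in Python, assuming the OCR text layer has first been dumped to plain text (for example with a tool such as pdftotext and its -layout option, which tries to preserve columns). The column layout, the header line, and the sample rows below are hypothetical placeholders; the real lists will almost certainly need a different pattern. The same idea would carry over to AWK.

```python
import re

# Invented sample text standing in for one page of extracted output.
# Assumed layout: model ID, name, description, price, with columns
# separated by runs of two or more spaces.
SAMPLE = """\
MODEL NO.   NAME              DESCRIPTION                 PRICE
1234        EXAMPLE A         17 jewels, yellow           $49.50
1235        EXAMPLE B         23 jewels, white, exp. band $71.50
"""

# A data row starts with a numeric model ID and ends with a dollar price.
ROW = re.compile(
    r"^(?P<model_id>\d+)\s{2,}"
    r"(?P<name>.+?)\s{2,}"
    r"(?P<description>.+?)\s+"
    r"\$(?P<dollars>\d+)\.(?P<cents>\d{2})\s*$"
)

def parse_rows(text):
    """Return one dict per line that matches the assumed row layout."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header, blank line, or a row the pattern can't read
        rows.append(
            {
                "model_id": m.group("model_id"),
                "name": m.group("name"),
                "description": m.group("description"),
                # Store the price as whole cents to avoid float rounding.
                "price_cents": int(m.group("dollars")) * 100
                + int(m.group("cents")),
            }
        )
    return rows

if __name__ == "__main__":
    for row in parse_rows(SAMPLE):
        print(row)
```

Lines that don't match are skipped silently here; with real OCR output it would probably be worth collecting them in a separate list so misread rows can be checked by hand.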

Thanks for any help or suggestions.  ...and I hope folks keep an eye out for more of these price lists.