All posts by Peter McCracken

New feature: Introducing stopwords

One of the neat things about having an online database is that one can study data to figure out how to make the system work better. This wouldn’t be the case if this were, say, a CD-ROM product.

I can look at all the searches that have been done on the site in the past year or so. In doing this, it’s clear that a lot of people include terms like “USS”, “HMS”, “USCGC” and other descriptive terms in front of the ship name. Others include vessel descriptors, such as “schooner” or “steamer”. For a long time, I’ve wanted to have a way of ignoring those terms, because it will get users to the content they really want more quickly. However, as with most things, it’s not as easy as it seems.

It’s easy to have a list of stopwords — words that are ignored in searches. Many search tools do this, so when you include “the” or “an” in a book title, Amazon doesn’t bother to search for these words. Of course, they still need to make exceptions to deal with searches for the band “The The”, and the like. And in the case of ShipIndex.org, one still needs to be able to search for “HMS” in a name, since some ships do have that as a legitimate part of their name – though none of them are part of the British Royal Navy.

So anyway, I reviewed the list of search terms, and came up with specific words or phrases that need to be ignored. Then we (and by “we” I don’t mean me – I mean the excellent development team that turns these ideas into reality) created the tools to ignore these words, and also show a results message that says, basically, “We ignored this term, but you can repeat the search without ignoring it if you like.”
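For the curious, here is a minimal sketch of how this kind of stopword handling might look. It is not the code the development team wrote; the word list, the function names `filter_query` and `search_notice`, and the message wording are all assumptions made up for illustration.

```python
# Hypothetical stopword list; the real list used on the site is not given here.
STOPWORDS = {"uss", "hms", "uscgc", "schooner", "steamer"}


def filter_query(query, ignore_stopwords=True):
    """Return (terms_to_search, ignored_terms) for a ship-name query."""
    terms = query.split()
    if not ignore_stopwords:
        # The user asked to repeat the search without ignoring anything.
        return terms, []
    kept = [t for t in terms if t.lower() not in STOPWORDS]
    ignored = [t for t in terms if t.lower() in STOPWORDS]
    if not kept:
        # Every term was a stopword (a search for just "HMS", say),
        # so fall back to searching for the query as typed.
        return terms, []
    return kept, ignored


def search_notice(ignored):
    """Build the message shown when terms were dropped from the search."""
    if not ignored:
        return ""
    return "We ignored: " + ", ".join(ignored) + ". You can repeat the search without ignoring it."
```

Calling `filter_query("USS Constitution")` searches for "Constitution" alone and reports "USS" as ignored, while passing `ignore_stopwords=False` is the "repeat the search" path that keeps every term.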

The result will be a significant improvement in the results that people see when doing their searches. Let me know if you like it or if you don’t.

Most popular ships in libraries, from OCLC

Not long ago I analyzed information about vessels registered in the US and found a list of the 100 most common ship names in America. Folks at OCLC, the online library cooperative, did something somewhat similar lately – and found the 10 most common ships in library collections.

Thom Hickey, Chief Scientist at OCLC, had put together a list of about 50,000 authority records for ships, for me – which went into the free part of the ShipIndex.org database. To be clear, an ‘authority record for a ship’ is a record that defines a ship as an entity, not unlike a record that defines a person. Ships can be subjects of books, of course, but they can also be authors of books. As an example, the logbook of a vessel is “by” the ship, in addition to being by the person or people who recorded the information – though often their names may not be known.

This new set of 50,000 authority records was a great enhancement to ShipIndex.org (see my blog post about these records, and see an earlier post about how best to use them). It updated a set of about 40,000 records that had been put together for me five years ago, and also gave me a chance to correct, improve, and update this information in the ShipIndex.org database. Again, all of this information is in the free portion of the database. These authority records make a great way of finding books about, say, the Titanic or the Lusitania, and are particularly valuable when one is searching for a whole book about a particular ship.

The folks at OCLC then went one step further, and decided to see what ships were most popular in libraries around the world. They took the list of ships they’d generated, then looked at how many library holdings were noted for each ship. This is a great way of measuring popularity: you’re not looking just at how many books (or movies or other works) have been created about a particular ship, you’re also looking at how many libraries own each of these works.
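The difference between counting works and counting holdings is easier to see with numbers. This is only a sketch with made-up sample data; I don't know how OCLC actually ran its analysis, and the name `rank_by_holdings` is my own.

```python
from collections import defaultdict

# Made-up sample data: (ship, work title, number of libraries holding that work)
works = [
    ("Titanic", "Book A", 1200),
    ("Titanic", "Film B", 800),
    ("Mayflower", "Book C", 900),
    ("Bounty", "Book D", 300),
    ("Bounty", "Book E", 450),
]


def rank_by_holdings(records):
    """Sum library holdings across every work about each ship, highest first."""
    totals = defaultdict(int)
    for ship, _title, holdings in records:
        totals[ship] += holdings
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = rank_by_holdings(works)
```

In this sample, Bounty has two works and Mayflower only one, but Mayflower still ranks higher because more libraries own its single work.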

The results are here; I think it’s no surprise that Titanic tops the list. I will admit I was quite surprised about the rest of the top five: Mayflower, Bounty, Amistad, and Endurance.

Ship name Number of holdings*
  Titanic 260,693
  Mayflower 48,657
  Bounty 35,382
  Amistad 32,464
  Endurance 27,877
*as of 11 July 2014

The OCLC Blog entry has great examples of resources about each ship (check out the image of a Titanic made from dried apples!), findable through WorldCat, and is very much worth a close read. Take a look at the next five most popular ships, too – you might be surprised at what is most popular in libraries.

Maritime history is everywhere – Mark Twain as an example

One of my biggest challenges comes in making clear the role of maritime history in American history and life, and in world history and life. In a way, I think vessels are so ubiquitous that they’re not even noticed. But for most of recorded history, news and information (and with it, human connections) used ships to travel long distances. If you look, everywhere you turn you will see a maritime impact.

I noticed that again today, when I visited the many Mark Twain sites in Elmira, NY. Elmira is not far from where I live, but I don’t go there often. I knew there were connections with Mark Twain, but I did not realize how many. As it turns out, Twain’s wife and family lived in Elmira for many years, and Twain (well, Sam Clemens) regularly spent summers there.

Sam Clemens’ sister-in-law owned an estate about two miles outside of town (and Twain’s father-in-law was the richest man in town and owned the largest house in town), and she built a small rectangular study for Twain in 1874, which he loved and used extensively, writing much of many titles there in the summertime. The study was moved to Elmira College in the 1950s (Twain’s wife attended the college, and the family had many other connections there as well), and is now open for visits in the summertime, and by appointment other times of the year.

Where’s the maritime connection – other than water weaving its way through all of Twain’s work? Sam Clemens met Olivia Langdon through her younger brother, Charles. Many signs told me that Charles and Clemens met on board the steamship Quaker City, in 1867 in the Mediterranean, where they struck up a conversation. Eventually Charles showed Sam a picture of his sister, and Clemens was immediately taken with her. It took some time before they actually met, and then before Olivia accepted Clemens’ marriage proposal, but it all went back to the Quaker City.

Of course, ShipIndex.org has entries on Quaker City – more than 80 of them – and not all are about the ship that Clemens and Langdon sailed together on, but if you want to know more about that ship, I can’t think of a better place to start.

After visiting the sites at Elmira College I stopped at Twain’s gravesite in Woodlawn Cemetery, where he’s buried with his wife’s ashes (she died in Florence, in 1904), those of his children, and his only known grandchild.

In the end, water matters. Whether you’re studying the life of Sam Clemens or the writings of Mark Twain, water had a huge impact on his life. I’d argue it has that impact on the lives of many, many people, and I will keep trying to convince the world of that.

New content, through early June 2014

I’ve been writing a lot about the 38th Voyage, but in fact I’ve also been working on new content for the database. Here’s a list of content added since the last time I posted such a list:

New content:

Updated content:

So, we’re still plugging away at getting new content into the database, even while preparing for the sail. The free database grew by over 20,000 citations. And right now, the subscription database is at 3.36 million citations.

Practical points about my 38th Voyage

I head to Connecticut this Friday, to start my brief experience on the 38th Voyage. It is true that it will start on Friday the 13th, but I’m not the least bit worried about that, since I was born on a Friday the 13th — in fact, a Friday the 13th in June, too: this Friday, in addition to starting my Voyage, is my 45th birthday. Sounds like a pretty great birthday present for me, I think!

The Voyagers consist of 8-10 people on each of the nine legs of the complete voyage. Each Voyager submitted a proposal and application, and was then selected by staff at Mystic Seaport. I believe there will be nine Voyagers on our leg (one seems to get added and dropped from message to message…); we’re all listed here, under the first leg from New London to Newport.

We will board the Morgan on Friday night, and will spend the night in the bunks of the fo’c’sle. On Saturday morning, because of tides in New London, we plan to depart at 4:30 am. Some guests, who won’t be sleeping on board, will need to arrive an hour earlier — yikes.

Anyway, my initial thought, when I heard we’d leave at 4:30, was “dang, that’ll be early.” Then I immediately thought “Whoa. We’ll see the sunrise.” and “That just means lots more sailing!” and “It’s only 24 hours; I want as much of it as possible to be at sea.” So now I’m pleased. I just hope I can sleep a bit, the night before.

We will be towed out from New London, by a tugboat that a family has donated (including the fuel!) to the Seaport for use throughout the voyage. Roann, another Seaport vessel, will be accompanying us the whole way, as well. We’ll get towed some distance toward Newport, and then the tow line will be dropped, and we’ll spend the day sailing. At some point, the tug will reattach the tow line, and we’ll be towed into port in Newport, at Fort Adams State Park.

Except we won’t! For some reason, we’ll be anchoring offshore, and taking small boats in to the pier. That sounds amazing to me. I hope we get to help drop those anchors. I also thought it’d be great if we could drop a whaleboat from the davits, as we used to do in Demo Squad, to head in. Who knows; maybe I can suggest it.

As to how much one will be allowed to do, I think each Voyager will pretty much be allowed to do as much as they want, as long as they’re not in the way. I hope to take a pen and paper (and camera) and hang out aloft for a while. And go out on the bowsprit, perhaps. And take a turn at the wheel. And furl a sail. And so much more…

Anyway, that’s what the day looks like. I doubt I’ll be able to post much while I’m on board, though I don’t know. If I can, I’ll post it to the ShipIndex.org Facebook page. I will spend time over the next week or so sharing images and experiences through the blog, and perhaps through some other photo sharing service, too.

38th Voyage Training: Climbing the Rigging

Back in April, when I went to the Seaport for training for the 38th Voyage, one big part of the day was to try climbing the rigging, in preparation for doing so while underway. I have climbed the Morgan’s rigging many, many times, but most often it was 20 years ago, when I worked on the Demo Squad at the Seaport. I’ve been aloft occasionally since then, but wasn’t sure how I’d do this time.

My main concern was my shoulder – I’ve got this annoying “frozen shoulder” that limits my range of mobility, and can occasionally be anywhere from painful to excruciating if it gets bumped in just the wrong way.

So, after spending the day in the Seaport library, I met up with a friend who has worked at the Seaport, and been in charge of the Demonstration Squad, for many years. She had some other friends who wanted to give a try at climbing before the full training day. She has the authority to let us try this when the grounds are closed – or, I imagine, whenever she feels like it.

We went over to the Joseph Conrad, because the rigging on Morgan wasn’t done yet, and it wouldn’t have been advisable to try climbing that rigging yet. I had been dressed for the library, but I gave climbing a try, anyway. We started heading up the foremast rigging, and I was very pleased to see that I did just fine. My lack of range in my shoulder isn’t that big a deal; you mostly keep your arms close in to you when you’re climbing. I didn’t climb over the top, but I felt like that was more because I was cold than because I couldn’t do it.

And the next day, with the other Voyagers, I did climb over the top on Conrad’s mainmast. Here are some photos I took from there.

IMG_3243 IMG_3245 IMG_3246

These are all looking forward, from the mainmast to the foremast, where another Voyager (in yellow) was going over the top with the assistance of Seaport demo guy extraordinaire Tim (in orange).

IMG_3249

Here’s a view from the main top toward the whaleboats that other Voyagers were in. We’d done some whaleboat rowing before going aloft. Whaleboats are a blast to row (and sail) and the Seaport received a set of ten new whaleboats for the voyage from a variety of schools and museums.

IMG_3254

Here’s a bit of a selfie from the main top. It was hard to take any good pictures of myself from up there, unfortunately.

The 100 Most Popular Vessel Names in the US

The US Coast Guard publishes something called “Merchant Vessels of the United States”, searchable through their Maritime Information Exchange. It’s a directory of merchant ships over about 5 tons in size. (Smaller vessels that aren’t included in MVUS may be registered by states, rather than by the federal government.) Originally, it was in print, and many copies are still available in libraries or through online sources (here’s one from 1897). Then it was published as a CD-ROM, and then USCG made a database out of it, and put it online.

USCG used to have static, ship-specific links to the database, so you could follow a link that would take you right to the entry about the ship. I discovered some time ago that those weren’t working, and eventually I contacted USCG, and got a reply from them that, yes, static links were no longer available.

I had to decide what to do, and I realized I’d made a bit of a mistake in being too caught up on the static links. If you know that a database mentions a vessel, and you still need to search for it once you get to the database, that’s still far better than not knowing at all. Then, while preparing the updated file for import, I discovered that the Office of Science and Technology, of NOAA Fisheries, publishes its own version of the same database, but with vessel-specific links! So I changed what I was doing, and modified the links so they’d point to NOAA’s version of the database.

I will soon remove the old links to the USCG database, and I haven’t yet decided if I should add an updated version of that database, even though one must still do a search for the ship there. If the information is exactly the same as what appears in the NOAA version, I might not add those links. Thoughts?

Anyway, as part of this work, I noticed that many ship names are used over and over in this database. I thought I’d take this opportunity to determine the most popular vessel names in the US.

Here are some caveats: This data is based on information compiled from the USCG MVUS database. It’s not perfect. Some people put “MV” or “SS” or other terms in front of their ship names, which they really shouldn’t do. Others (many others) start their ship name with “The ”, which I also think they shouldn’t do. (That said, my brother built a rowboat for our father, and I carved a name plate for it, and we called it “The Prelude” – with the article – because it was a reference to, among other things, Wordsworth’s poem of that name [pdf]. So clearly at least some people specifically intend to include an article. Most, however, don’t.)

Also, I didn’t combine different spellings of the same name, like “Meant II Be”, “Meant 2 Be”, and “Meant To Be”. Ship names are obviously very popular places for puns, like “Naut On Call”, and they should be left as such. I also did not combine “Nauti Boy”, “Nauti Buoy”, “Nauti Boys”, “Nauti Boyz”, “Nauti Bouys”, etc., into one name…

With all that said, here are the 100 most popular vessel names, including the number of vessels with that name, from the 365,846 named vessels in the US Coast Guard’s Merchant Vessels of the United States database:

Vessel Name Occurrences
  Serenity 417
  Freedom 382
  Liberty 329
  Osprey 306
  Second Wind 289
  Destiny 285
  Andiamo 262
  Dream Catcher 247
  Spirit 245
  Odyssey 243
  Carpe Diem 232
  Island Time 232
  Escape 231
  Pegasus 231
  Blue Moon 230
  Morning Star 226
  Obsession 216
  Orion 216
  Island Girl 209
  Voyager 195
  Grace 193
  Serendipity 191
  Legacy 189
  Time Out 188
  Escapade 185
  Tranquility 185
  Happy Ours 183
  Summer Wind 183
  Aurora 174
  Phoenix 171
  Free Spirit 169
  Double Trouble 168
  Harmony 167
  At Last 164
  Patriot 164
  Magic 163
  Sandpiper 163
  Relentless 162
  Southern Cross 162
  Halcyon 159
  Mariah 159
  Amazing Grace 157
  Pelican 154
  Endless Summer 153
  Calypso 152
  Whisper 151
  Encore 148
  Imagine 148
  Pura Vida 148
  Seas the Day 148
  Impulse 147
  Eagle 146
  North Star 144
  Zephyr 144
  Wanderer 143
  Ariel 142
  Great Escape 142
  Quest 141
  Raven 141
  Cool Change 140
  Prime Time 140
  Second Chance 138
  Camelot 136
  Hakuna Matata 136
  Mirage 136
  My Way 136
  Panacea 134
  Windsong 134
  About Time 133
  Valkyrie 133
  Perseverance 132
  Journey 131
  Valhalla 131
  Puffin 129
  Patience 128
  Dream Weaver 126
  Restless 125
  Gypsy 124
  Renegade 124
  Black Pearl 123
  First Light 123
  Sanctuary 122
  Sundance 122
  Independence 121
  Resolute 121
  Dulcinea 120
  La Dolce Vita 120
  Sea Hawk 120
  Islander 119
  Moondance 119
  Sea Breeze 119
  Sea Ya 119
  Dragonfly 118
  Liquid Asset 118
  Aquaholic 117
  Dolphin 117
  Oasis 117
  Shearwater 117
  Adagio 115
  Sea Horse 115
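
A tally like the one above takes only a few lines of code. This is a sketch of the general approach, not the exact script I ran; the function name `most_common_names` and the sample names are just for illustration. In keeping with the caveats, it leaves variant spellings and puns as separate names.

```python
from collections import Counter


def most_common_names(names, n=100):
    """Count exact vessel names and return the top n as (name, count) pairs."""
    # Only surrounding whitespace is trimmed; "Meant 2 Be" and "Meant To Be"
    # are deliberately counted as different names.
    counts = Counter(name.strip() for name in names if name.strip())
    # Highest count first; names with the same count are listed alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:n]


sample = ["Serenity", "Freedom", "Serenity", "Meant To Be",
          "Meant 2 Be", "Freedom", "Serenity"]
top = most_common_names(sample, n=3)
```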

38th Voyage training: Visiting the Library and Collections at Mystic Seaport

One of the great treasures of Mystic Seaport is its research collection. Like any museum, they are only able to display a very small portion of their collection at any given time.

I’ve spent a fair bit of time in the library there, both in its old and its new locations, but it’s always great to visit again. I have also often had a chance to visit the CRC, the building where the library, along with hundreds of small boats, and many other special items that can’t be on display, are stored.

Still, I never pass up a chance to visit the place. Here are a few shots of the hundreds and hundreds of boats in the mill:

IMG_1414 IMG_1415 IMG_1416 IMG_1417 IMG_1418 IMG_1421 IMG_1422 IMG_1424 IMG_1427

From there, we went to the collections storage area – ship models galore, paintings, drawers and drawers of scrimshaw, clothes and costumes, nautical instruments, and so much more.

IMG_1432 IMG_1433 IMG_1434 IMG_1436 IMG_1438 IMG_1440

I can’t quite find ways to describe how important these collections are, along with the work their staff do. There’s so much more to a museum than just the exhibits, and the Seaport’s library, and its excellent staff, are a perfect example of that. Doing research in a collection like this makes it possible for people to learn new insights, discover old truths, and better understand what our ancestors did and why they acted as they did.

But libraries, especially specialized ones like the Seaport’s, generally don’t get a lot of financial support from their institution. Much smaller libraries (the Seaport’s is the largest maritime library in the US, and one of the largest in the world) have even less support, and are even harder pressed to justify their presence, or the growth of their part of the organization.

I feel certain that there must be ways to better monetize the resources in the research collection. I realize that could sound heretical, and probably sounds terrible. (I admit, “monetize” isn’t the loveliest word – but it is specific and accurate in this case, so I’ll stick with it.) But it is what needs to be done. A library needs to justify its value to the organization by generating revenue. There’s plenty that can be done for researchers that doesn’t involve revenue generation, but there is so much more that can be done for them, as well. And when it creates revenue, it gets attention within the organization, and is seen as a force for growth, rather than a drag on expenses. The fact that something costs money tends to give it greater ‘value’.

I would like to see maritime museum libraries work together to create tools that non-maritime people will want to use, and will want to pay for. I don’t know if it can happen, but if there’s a chance, I’d like to see if I can help make that so.

Training for the 38th Voyage: Visiting the Charles W. Morgan

In late April, I went to Mystic Seaport Museum, as part of training and preparation for the 38th Voyage. As previously noted, I’ve been to the Seaport many, many times, and have lived in Mystic twice – once while studying at the Seaport, and once while working there. So it is not a new place to me, but it is always a favorite.

This time, I went early so I could spend a day and a bit more working in the G.W. Blunt White Library, collecting information to add to my ShipIndex.org database. I got some help in the form of a smart young boy who was there with his mom; she was doing some genealogical research, and his computer wasn’t working, so I asked if he’d like to help me. I gave him a quick rundown on using the Library of Congress classification (because of course he was more familiar with the Dewey system used in his school and public libraries) and then we were off.

I had this view of a model of the whaler Two Brothers, from my desk:

two_brothers

After I posted this on my Facebook page, several friends described how they’d seen the real ship, at its wreck site, in Hawaii.

The actual training program began the next morning. All the Voyagers who attended that training session (there was one other) gathered in the morning to do introductions, and learn a bit about the Seaport. As far as I could tell, only one Voyager had never been to the Seaport before. He was leaving early, too, to get to Vienna – I felt like maybe he wasn’t taking this thing seriously enough.

We broke into three groups, and my group was the first to visit the Morgan. Built in 1841, she has just completed a five-year restoration project. Many timbers were replaced, but some originals still remain, as seen here:

IMG_1404

The portion on the left dates to the ship’s construction (so, 1840 or so) and the portion on the right is brand new, with this restoration, to replace rot.

In the hold, we saw some of the extensive fittings and changes made for the voyage. Extensive fire safety systems, and plumbing systems, have been installed for the voyage, including a bank of heads (toilets). While they’re installed in a permanent fashion, most of these will be removed when the ship returns to Mystic in August. Look at all of this:

IMG_1408 IMG_1397 IMG_1399 IMG_1409

Here are the crew bunks, where we’ll sleep. (We may have the option of sleeping on deck, if we so choose.) We’ll board at about 7pm the night before, spend the night on board, then sail away very early the next morning.

IMG_1388

We then visited the Collections and Resource Center, which will get its own post.

Updated OCLC WorldCat data – 20% more, and more accurate

I’ve updated an important resource, adding 20% to its contents, and improving the accuracy of all of the data in it. When we converted ShipIndex.org from a hobby to a business, we worked with OCLC to get a file of books by or about ships. For more about how these records are used, see the first of two posts about WorldCat records, here.

In any case, we agreed with OCLC that these records would remain in the free database, rather than the newly-created subscription database. There were about 40,000 records in that file. Last month, I had the opportunity to visit OCLC’s headquarters, in Dublin, Ohio. While there, I received an updated version of this file, which now contains over 50,000 authority records for ships.

I worked through the file, doing cleanup and corrections, and spent a few tries at loading the file into the ShipIndex.org database. It wasn’t as easy as other files, because the OCLC records are fully Unicode compliant. The database likes UTF-8, but the full range of Unicode characters in these records is a bit beyond its abilities. (Actually, not in its abilities to display vessel names, but in its abilities to store them.) I replaced vessel names in Cyrillic, Japanese, Chinese, etc., with their transliterated names, and also removed a lot of the Unicode characters that were causing problems.
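Here is a rough sketch of one way this kind of cleanup could be approached, using only Python's standard `unicodedata` module. It is an illustration under my own assumptions, not the process I actually followed: the function names are made up, and the transliteration itself would still be done by hand or from the romanized form in the record.

```python
import unicodedata


def strip_diacritics(name):
    """Decompose accented letters and drop the combining accent marks."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def needs_transliteration(name):
    """True if any letter is from a non-Latin script (Cyrillic, CJK, etc.)."""
    for ch in strip_diacritics(name):
        # Letters such as "ø" do not decompose, but they are still Latin,
        # so they are not flagged here and would need separate handling.
        if ch.isalpha() and "LATIN" not in unicodedata.name(ch, ""):
            return True
    return False
```

A name like "Président" comes back from `strip_diacritics` as "President", while a Cyrillic or Japanese name is flagged by `needs_transliteration` so it can be replaced with its transliterated form.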

I also fixed a lot of names that I hadn’t fixed the first time around. Most of these were ship names with prefixes attached, like “USS Daffodil” or “HMS Daffodil” or “S/S Daffodil”. It’s always best to search without those prefixes. I have cleanup still to do on those leftover ship names, but the new records are live and I can do the cleanup later.
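Separating a prefix from a name is simple enough to sketch. The prefix list below covers only the three examples I mentioned, and `split_prefix` is a made-up name; a real cleanup would need a much longer list and a check for ships that legitimately have such a word in their name.

```python
import re

# Hypothetical prefix list, taken only from the examples in this post.
PREFIX_PATTERN = re.compile(r"^(USS|HMS|S/S)\s+", re.IGNORECASE)


def split_prefix(name):
    """Return (prefix, bare_name); prefix is None when no known prefix leads the name."""
    match = PREFIX_PATTERN.match(name)
    if not match:
        return None, name
    return match.group(1), name[match.end():]
```

Because the pattern requires whitespace after the prefix, a name that consists only of "HMS" is left untouched.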

So now, as a result, the OCLC WorldCat resource has grown from about 40,000 to about 50,000 citations, and the metadata is much improved. All of these citations are in the free database. This is a big improvement all around. Thanks again to OCLC for creating this file for me!