Category Archives: New Content

WorldCat (April) Fools

This is the first of a few new blog posts. It’s April 1, April Fools Day, but there is, alas, no foolin’ around here. It’s just bad news, start to finish, with the WorldCat subject entity links that have been in the free ShipIndex database since 2009. Read on, to learn more.

When ShipIndex switched from a personal project to a real company, back in 2009, I put all of the citations that had been in the “project” database into the free database. Anything new was going to go into the subscription database. I had been in contact with researchers at OCLC, the very large library cooperative that ostensibly helps libraries manage their resources, and shares those holdings via their publicly available database called WorldCat. I worked with several remarkable people there, who through the years generated a list of all of the “identities” for ships in WorldCat.

This meant we could find books or manuscripts that were by or about ships. So, a book about a ship is easy enough to imagine – the book The Royal Yacht Britannia: The Official History is clearly about that vessel. Having a specific subject heading about that specific yacht makes it easier to differentiate between vessels with the same name. It also created links to books by ships, which often meant logbooks or individually kept personal journals by people who were on board a vessel. It was a great way of uncovering a lot of useful content about ships that wouldn’t be found otherwise.

But the folks at OCLC said this content needed to be in the free database, not in the then-nascent subscription database. That was fine with me; it was worth including that content and keeping it freely available. The file has been updated occasionally over the past few years, and has always been in the completely free database.

Two or three weeks ago, I was doing some searching, and looked at WorldCat records. I saw notices indicating that the OCLC Identities project, on which these links were based, was going away. This past week, all the links to WorldCat failed. OCLC has ended this project, and with it, links to lots of content that used to be in the database. They’ve also removed linking by Library of Congress Control Number. You’re just searching by phrase now – this seems like the total antithesis of the ideals behind Linked Data.

I have figured out a way to make these links mostly work. The links are now searching by subject headings, rather than by control numbers or identities. As a result, in many cases, they won’t work effectively. In the old file, there was a search to an identity for a ship named “104”, and it specifically went to the entry for a specific ship with that name. Now, the search is for any entry that has both terms “104” and “ship” in a subject heading, so instead of one or two specific results, you get 38 results. Some refer to ‘cruise 104’ of a different vessel. It’s really too bad. Searches for ships like “Mary” are going to be terrible, because they’ll include ships named “Mary Rose”, “Mary Ellen”, “Mary & Frank”, “Mary Smith”, and any other ship that has ‘mary’ as just part of its name – instead of going directly to the ship you’re researching. A search for a single-word, common ship name, like “Eagle” or “Union” or “James” or “Monitor” or “Wasp”, is going to return any record that has that word anywhere in the list of subject headings, even if the term doesn’t have anything to do with a ship name. Connections we’ve made, between specific vessels represented in WorldCat and other citations for those specific vessels, are probably no longer relevant.
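To make the difference concrete, here is a rough sketch in Python of the two kinds of lookup. Everything in it is an assumption for illustration: the records, the identity keys, and the function names are made up, and this is not how WorldCat or ShipIndex actually runs its searches. It only shows why matching on words in subject headings returns more than a lookup on a specific identity.

```python
import re

# Hypothetical records: (identity key, subject headings). Not real WorldCat data.
RECORDS = [
    ("ship-mary", ["Mary (Ship)"]),
    ("ship-mary-rose", ["Mary Rose (Ship)"]),
    ("ship-mary-ellen", ["Mary Ellen (Ship)"]),
    ("bio-mary-stuart", ["Mary, Queen of Scots", "Ship models"]),
    ("ship-eagle", ["Eagle (Ship)"]),
]


def by_identity(key):
    """Old-style link: return only the record with this exact identity key."""
    return [record for record in RECORDS if record[0] == key]


def by_subject_words(*terms):
    """New-style link: return every record whose subject headings,
    taken together, contain all of the given words."""
    wanted = {term.lower() for term in terms}
    hits = []
    for key, headings in RECORDS:
        words = set(re.findall(r"[a-z0-9]+", " ".join(headings).lower()))
        if wanted <= words:  # every search word appears somewhere
            hits.append((key, headings))
    return hits


print(len(by_identity("ship-mary")))            # one specific vessel
print(len(by_subject_words("Mary", "ship")))    # every record with both words
```

With this made-up data, the identity lookup finds the one ship named “Mary”, while the word search also pulls in “Mary Rose”, “Mary Ellen”, and a record that has nothing to do with a ship named “Mary” at all.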

OCLC did some work in creating Virtual International Authority File (VIAF) records for some ships, as well. Again, this was great in differentiating between ships with the same name. But as far as I can tell, that is also all wiped out.

I’m disappointed and frustrated by this change, as I am with most of what OCLC has done to WorldCat over the past few years.

I’ll leave with this image I collected from WorldCat a few weeks ago, telling me that a copy of a book I wanted was at the State Library of South Australia, but that library is further than the distance to the moon:

My frustration with WorldCat – and OCLC – is ancient news, but it does just keep getting worse. This is really unfortunate. This is NOT a good April Fools joke.


Wow – it’s been quite a while since I last posted anything to the blog. One would be forgiven for thinking we’d disappeared. But we haven’t. In fact, we’ve been working away, adding new content, adding new (mostly backend) functionality, trying new marketing work, and more. But first, we’ve hit a big milestone — yesterday, we loaded our ONE THOUSANDTH resource! I’ll admit, I’m pretty astounded at that. Here’s a list of what we’ve added since the last blog post:

So, we’ve been working hard at adding new content. Getting to ONE THOUSAND resources is HUGE, in my book! We’ve got more to go, I assure you.

On the technology side, we have converted most of our subscription processing from PayPal to Stripe. We think that’s better for us, and better for our customers. If you have an opinion otherwise, I’d be glad to hear it; for now, we think it will make things easier for users. We have some more work coming soon, this time on the login process.

As always, if you have suggestions for content to add to the database, or questions about how it works, please contact me at comments (at) shipindex (dot) org, and share your thoughts, suggestions, and ideas. Until then, fair winds!

Last few months of new content!

Goodness, it’s been a while since I added a blog post. We do have a lot of new content that’s been added, and some new great functionality, as well — I need to write something about that, since it is its own big step forward.

But for now, let’s list the content that has been added to the ShipIndex database since May:

There’s more to add soon, and maybe enough to get us over 1,000 resources in the database before the end of the year. We’ll see!

We have a lot of files still left to process, and we’re going to be adding a lot of new files, too, soon. So, as always, there’s more content coming soon. If you know of a title whose index should be added to the database, please do let me know, at comments (at) shipindex (dot) org — now’s a great time to get some more titles on the list, so they’ll be processed soon!

Most Popular Vessel Names in the US

I updated the Merchant Vessels of the United States database today. That’s a big file (~375k entries) and it serves as an interesting collection of personal and merchant vessels.

(There’s a minor error in the import, in that about 10% of the entries – in the Os through Rs – are duplicated. I’m working on correcting that problem. Also, apologies about the layout in this blog post, particularly with the tables. Not sure what the problem is, but I’ll try to correct it.)

Unfortunately, the US Coast Guard has changed their system, and NOAA has dropped their version of the database altogether, so you can no longer link directly to a specific ship. This is very frustrating, but I can’t control other sites’ setups. The URL will take you to the search page, and you can search again for the ship name that you’d found in ShipIndex.

The Coast Guard has also removed tons of personal information about owners of recreational vessels. The remaining information will still be useful to some.

MVUS also creates an interesting opportunity to look at a really large data set, and get a good sense of what vessel names are most appealing to the most people in the US.
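As a rough sketch of how a count like this could be done, here is a short Python example. The rows, the use of an official number to spot duplicated entries, and the function name are all assumptions for illustration, not the actual MVUS file layout or the process used for ShipIndex.

```python
from collections import Counter

# Hypothetical rows: (official number, vessel name). Not real MVUS data.
ROWS = [
    (1001, "Serenity"),
    (1002, "Osprey"),
    (1002, "Osprey"),     # same entry loaded twice, like the import error
    (1003, "serenity "),  # same name, different capitalization and spacing
    (1004, "Liberty"),
    (1005, "Osprey"),
]


def popular_names(rows, top=3):
    """Count vessel names, skipping rows whose official number was already seen."""
    seen = set()
    counts = Counter()
    for number, name in rows:
        if number in seen:
            continue  # duplicated entry; do not count it again
        seen.add(number)
        counts[name.strip().title()] += 1
    return counts.most_common(top)


print(popular_names(ROWS))
```

Skipping repeated official numbers keeps a duplicated import from inflating a name's count, and tidying the capitalization and spacing means “Serenity” and “serenity ” are counted as the same name.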


More new content, and other new stuff coming soon…

It’s time for yet another list of new content. It has been a while since I’ve added to the list here, and to be honest our speed of importing new data has slowed a bit. But we’re still working at it, and we still welcome suggestions of content to be added. Content work continues day in and day out.

On the other side of things, 2019 was actually a year of a lot of development. We are just about to see that come to fruition, in the database itself. I will explain more about that after it has been released, and implemented a bit. I hope that will be very, very soon.

Until then, here’s a list of content added since the last time I posted here — which was, admittedly, quite a while ago, back in November.

New content:

This list includes five additional Roebuck Society volumes, for those interested in Australian history. These volumes are really tough to work through, and take a lot more time than most volumes. For more about them, read my Roebuck Society blog post from September.

We always have more to add, and we’re working through it as quickly as time allows. If you have suggestions, please do let us know. And watch for more big news very soon!

More new content; more new Roebuck volumes

The addition of new content to ShipIndex has slowed, but it hasn’t stopped. Here’s the list of resources added since my last posting:

I’ve previously written about the Roebuck Society volumes, and all the content that appears in them. They vary a lot, but there are a number in this import as well. If your ship is mentioned in a Roebuck Society volume, look at the entire volume closely, as it can contain a lot of data in a small amount of space. See the blog post cited above for more information on these publications.

Though the addition of new content is slowing down a bit, we’re working on some cool new functionality on the website itself, which you might see if you poke around enough. I’ll write more about that soon.

Now with 3.5 million citations! (Almost.) And more content.

The ShipIndex database continues to grow: according to this screen grab from our home page, we’re just 126 citations away from 3.5 million citations!

This is certainly a new record for content, but it has taken a long time to get here. Several years ago, I had to remove some 380,000 citations from the database, because the online resource containing those citations disappeared. But we’ve been adding lots more content since then, and we’ve recovered and gotten beyond where we’d been.

Here’s content that has been added since the last list I posted. Lots more is in process, as always.

Several titles are worth particular attention. The four volumes published by the Roebuck Society are especially valuable for southern Pacific research, but they’re tricky to use. I wrote a blog post about just those volumes last week, and more titles from the Roebuck Society will be added over time.

Ward’s collection of notes from newspapers, about American activities in the central Pacific, is also interesting — the 7 volume set is remarkable in its own right, printed on heavy paper and with a volume of illustrations and maps, if I remember correctly. It’s organized geographically, which makes finding the entries a bit of a challenge. It’s also probably not a particularly common title, but if it mentions the ship you’re researching, those citations from contemporary newspapers are going to be pretty valuable!

I plan to write a brief blog post about the effects of low technology on this data, regarding the Naval Marine Archive, in the next week or two.

As always, let us know if you have titles to suggest we add.

Adding Roebuck Society volumes

Over the past year, we have been adding a TON of new content to ShipIndex. This should come as no surprise to anyone who’s looked at the blog – pretty much every entry has been just a listing of all the new content we’ve added since the last blog post about new content! Most of that has come from indexes to books, but some has been from online databases and websites.

Recently, we’ve started adding content from a special set of publications. If you’re interested in early Australian history, or Pacific exploration, these will be of particular interest. But they are challenging to search, and challenging to process and add to the ShipIndex database.

They’re published by the Roebuck Society, an Australia-based organization that has published many records about the arrival and departure of ships through Australia’s history. The books themselves are an amalgamation of entries from numerous sources. The content looks like this:

Title page from one of the Roebuck volumes.

A content page from the same Roebuck volume.

It’s tough reading! There’s a lot of information crammed on each of these pages. Luckily, there’s an index to all this madness, but it’s often not much easier to read. Consider this example of an index from the book above:


An index page from the above volume.


Processing these indexes has been very hard work, and has taken a lot of time and money to complete. Because of the complexity of the indexes and the associated text, understanding these indexes and how to use the information in them takes some work. If a ship of interest to you is mentioned in a Roebuck book, then your best bet is to track down the book itself. Unlike other titles, it just doesn’t make sense to ask for individual pages, based on the index citations.

Remember that you can almost always get almost any book through interlibrary loan from your local public library. It will take them some time, and it will cost them (and possibly you) some money – so be patient and don’t forget to thank them, and support them financially – but in most cases, other libraries will loan these books to your local library, and they’ll loan them to you.

Once you have the book in hand, find the ship on the index page shown, and then see where and how often the ship is mentioned within the body of the text. The entry in the text will give a summary of the ship’s movements, and provide information about the sources (usually newspapers) from which the data is drawn. Many ships are mentioned dozens and dozens of times. Many entries contain data from multiple sources, so – especially for tonnage – many data points may appear for each ship in the index. The printed index notes sources for some of this data, but we have not preserved those notations here.

The Roebuck Society has published over sixty volumes, but not all of them relate to vessel information. We have identified about a dozen relevant volumes to add. Some have already been added, and others will soon join the database.

Here’s a list of what’s been added to the database, and what’s in process. Live, in the database, as of publication of this blog post:

In processing, but headed for the database:

  • Broxam, Graeme, and Ian Nicholson. Shipping Arrivals and Departures: Sydney. Vol III: 1841-1844 and Gazetteer.
  • Broxam, Graeme. Shipping Arrivals and Departures: Tasmania. Vol III: 1843-1850.
  • Cumpston, J.S. Shipping Arrivals and Departures: Sydney. Vol I: 1788-1825.
  • Jones, A.G.E. Ships Employed in the South Seas Trade: Vol I.
  • Jones, A.G.E. Ships Employed in the South Seas Trade: Vol II.
  • Nicholson, Ian. Shipping Arrivals and Departures: Sydney. Vol II: 1826-1840.
  • Sexton, R. T. Shipping Arrivals and Departures: South Australia, 1627-1850.
  • Syme, Marten A. Shipping Arrivals and Departures: Victorian Ports. Vol. III: 1856-1860.


If you have questions, or suggestions on additional Roebuck volumes to add, or other titles to add, or thoughts on how best to use Roebuck volumes, please don’t hesitate to share them here, or send them to comments (at) shipindex (dot) org.

Happy searching!

Yet another list of new content

The ShipIndex data team has been hard at work over the past six to eight weeks, and we’ve added a lot more data. A full list of all content that’s been added since the last update appears below.

Some are short books, or brief websites, but they’ve got unique content you won’t find elsewhere. Some, like the Conway’s volumes, are much longer and have thousands of entries in them. All kinds of content has been added, but we always welcome suggestions for more!

Two weeks ago, we went to the National Library of Scotland, and collected content there that we couldn’t find elsewhere. That’s always a thrill. That content still needs to be processed, so it’s not in the database yet, but will be, eventually. There’s a benefit in knowing that a resource has some information that might be useful to you, even if it’s hard to get, because then you at least know that it’s out there, and you can request it through interlibrary loan. Or, if you travel often, you can use WorldCat to determine which libraries own it, and then when you go near one of those libraries, you have a reason to visit. I, for one, was thrilled to have a reason to add a new library card, from the NLS, to my collection!

Now, here’s a list of the content added since the last update:

As always, send a note to comments (at) shipindex (dot) org if you have titles you think we should add!



More new content

It’s been over six weeks since I last posted a list of recently-added content, so of course there’s tons more waiting to be listed here. In mid-May, the ShipIndex team met up in Washington, DC, and visited the Library of Congress. We collected indexes to a ton of titles that we hadn’t found elsewhere, and we’ve processed some of those titles already. A lot more are still waiting to go through the whole process, and will be added over the next few months.

As always, we welcome recommendations and suggestions for titles that should be added to the ShipIndex database.

The following content has been added to the database since my last update:

As mentioned above, lots more is still to come!