Old Ship Picture Galleries temporarily down – what should ShipIndex do?

I discovered this afternoon that Old Ship Picture Galleries was recently taken down.

A note on the home page says “If you want this site back, e mail darrenmbrown@optusnet.com.au  and ask him to stop bombarding with e mails. He seems to take exception to me posting some copyright expired pictures that he has paid someone for. I do this as a hobby for the enjoyment of others and just don’t want to know all this animosity – life’s far too short for that! for god’s sake it’s only a picture of an old ship!”

This site had a ton of great images of old ships, and it’s a shame that the author feels bullied into taking it down, but I do not fault him for doing so.

I don’t know when it will be back; I guess I may take the links out of the ShipIndex.org database, at least for the time being, though I hesitate to do that. I’m trying to decide what to do right now. What are your thoughts?

ShipIndex content in library discovery layers

One of the biggest changes in academic libraries over the past few years has been the development of “discovery layers”: collections of paid, unique data that are pre-indexed and then easily searched by specialized search engines.

For those readers not in the library industry, keep in mind that Google, Bing, Yahoo!, and other search engines cannot crawl through data that is in siloed, subscription database collections. That data is limited only to people and institutions that have paid for access to it. So, a big benefit that libraries have held over Google is offering the content inside these databases. Such databases range from big vendors who gather together (or “aggregate”) content from many different sources – examples include ProQuest, Gale, EBSCO, Project MUSE, and a variety of others – to smaller publishers or content providers who generate unique content that they believe they can offer for sale to individuals or institutions. The drawback, however, has been that it wasn’t easy to find all that data – you had to go to each different silo and search that database to see if there was anything of interest there. And of course, first you had to know that each database (or silo) existed.

For a while librarians used “federated searching”, but it wasn’t a great solution. With a federated search tool (and they are still certainly in use in many libraries), the computer takes your search terms and goes out to search each of the many different databases that you’ve selected, waits for all the search results to come back, and then compiles the results together. In most cases, it’s not a very elegant solution, and it’s easy to see why the speed and simplicity of a Google search became so popular – even when the content wasn’t as good.

Google, of course, doesn’t go out and do a search the moment you type words into its search box; it has already reviewed and ‘indexed’ all of that content, and whatever it has indexed is what will be in the search results it provides to you.
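To make the contrast concrete, here is a rough Python sketch of the two approaches. The silo names and their contents are invented for illustration, and real federated search tools and discovery layers are far more sophisticated; the sketch only shows the difference between asking every database at search time and consulting an index that was built ahead of time.

```python
from collections import defaultdict

# Hypothetical silos, invented for this example: each maps a record ID to text.
SILOS = {
    "silo_a": {"a1": "Voyage of the Elizabeth Davidson", "a2": "Whaling logs"},
    "silo_b": {"b1": "Elizabeth Davidson crew list", "b2": "Harbor records"},
}

def federated_search(term):
    """Query every silo at search time, then merge the results."""
    term = term.lower()
    hits = []
    for silo, records in SILOS.items():  # one round trip per silo
        for record_id, text in records.items():
            if term in text.lower():
                hits.append((silo, record_id))
    return sorted(hits)

def build_index(silos):
    """Read every record once, ahead of time, and map each word to records."""
    index = defaultdict(set)
    for silo, records in silos.items():
        for record_id, text in records.items():
            for word in text.lower().split():
                index[word].add((silo, record_id))
    return index

def indexed_search(index, term):
    """Answer from the prebuilt index; no silo is contacted at search time."""
    return sorted(index.get(term.lower(), set()))

INDEX = build_index(SILOS)
print(federated_search("davidson"))
print(indexed_search(INDEX, "davidson"))
```

Both calls find the same records. The difference is when the work happens: the federated version does it while the searcher waits, and the indexed version did it beforehand.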

So, library database vendors tried to create solutions that allow libraries to compete with Google in this area. Their strong differentiator is that the data they’re indexing is the subscription-based content, rather than data on the free web, which Google indexes. Examples of these are “EBSCO Discovery Service” from EBSCO, “Primo Central” from Ex Libris, “Summon” from Serials Solutions, a division of ProQuest, “Encore” from Innovative Interfaces, and a few others. (As an aside, I was a co-founder of Serials Solutions; I was involved in the sale of Serials Solutions to ProQuest, and remained with the company for a while after the sale; and was slightly involved with the development of Summon. While I will always have a soft spot in my heart for Summon – to the extent one can have a soft spot for a discovery layer, I suppose – I am today very interested in making sure that ALL institution patrons have access to the ShipIndex.org data, through ALL discovery layers.)

Content from ShipIndex.org is now indexed in Summon and EDS, and I’m looking to get it into other discovery layers, as well. Here’s an example of what search results look like at a library that subscribes to both ShipIndex.org and Summon, from Serials Solutions:

When a student does a search for a ship — in this case the Elizabeth Davidson — they find a citation for that vessel in ShipIndex.org, and a link to take them directly to the page for that ship. They didn’t even need to know that ShipIndex.org exists. They search in Summon (or EDS, Primo Central, or another discovery layer) and they find content that they wouldn’t have otherwise found.

To be clear, a library must subscribe to both a discovery layer and the underlying databases, for the databases’ contents to appear in the discovery layer. Discovery layers are definitely not cheap, but they do make a huge difference in improving how library patrons discover the resources that the library already subscribes to.

I’m pleased that ShipIndex.org’s data is in Summon and EDS, and I look forward to doing whatever I can to make it available to users of other discovery layers, as well.

Watch out for inaccurate citation numbers!

The back-end enhancements mentioned in my last post now allow us to do some significant data improvements.

Over the next few days, I’m going to be doing some work to improve how we represent links into the impressive Ship Register Database provided by the library at Mystic Seaport. Initially, we described this resource as “Ship Register (1857-1900) Database, by G. W. Blunt White Library”. While that gives appropriate credit to Mystic Seaport’s library for creating this incredible database, it doesn’t describe what’s in the database, which is even more important. So we’re splitting out the database’s contents into three sections, reflecting the three publications that are included in this single database.

Those three publications are:

  • New-York Marine Register:  1858
  • American Lloyds’ Registry of American and Foreign Shipping:  1859, 1861-83
  • Record of American and Foreign Shipping:  1871-3, 1875-9, 1881-1900

I did find some discrepancies between what’s listed as being available, and what’s actually available. New-York Marine Register for 1857, for instance, is available on the site, but not searchable by vessel name. The same goes for the 1874 volume of Record of American and Foreign Shipping.

When we’re done, we’ll more accurately represent the sources for this data. However, until then, you may notice a dramatic, but inaccurate, increase in the number of citations in the ShipIndex database. We don’t want to remove any data for those who might be using it during this switch-over, and I decided that duplicate information, in some cases, was preferable to missing information. After we’ve imported all the data that’s linked to the new resources (that is, “New-York Marine Register”, “American Lloyds”, etc.), we’ll delete the data listed under the old resource (“Ship Register Database”).

I don’t think it’ll take more than a day or so to get all the new data loaded, and the old data removed, but I’m not completely certain. I’ll add a note to this blog post when that’s complete.

ShipIndex in London

When I went to the ShipIndex mailroom today (OK, the Trumansburg, NY, post office), I found an envelope from England awaiting me. What was it? It’s ten passes to this year’s “Who Do You Think You Are? LIVE” exposition, in London, at the Olympia Exhibit Hall. I’m excited about going to London to exhibit at this show in a few weeks, and now I can share it with my ten closest friends!

Since I know very few people in London (and the one I know the best is leaving the morning of the conference), I’ve got lots of spare passes. Please let me know if you’d like one — they’re worth about £22 each! (It’s an expensive show.)

I’m getting ready for the show here at ShipIndex world headquarters — I’ve set up my dummy exhibit space to see how it’ll all go together, and my son has been weighing and filling bags of 100 bottle openers apiece. I’m hoping I correctly estimate how many postcards, bottle openers, and brochures to bring, and that I’ll have everything I need, especially since I’ll be in a foreign country with funky electricity and strange customs.

I’ll be at stand 311. If you’ll be in the neighborhood, please do come by and say hello. And if you want a pass, let me know.

ShipIndex as a gift

Looking for a last-minute gift for a maritime historian or a genealogist?

Consider a limited-span subscription to ShipIndex.org!

You can give a genealogist three months of access to the premium database for just $25. Or give a historian access to the premium database for six months for $45. Or give a maritime history fanatic access for a year, for just $85! This is a one-time payment, via PayPal (and yes, you can use a credit card through the PayPal site).

To make it happen, send a note to gifts@shipindex.org. We’ll need the following information:

  • The recipient’s email address
  • When you’d like access to begin, and for how long

We’ll create a pdf certificate that you can print out or email to the recipient. It will include a username and a temporary password, plus information on how to access the database.

This can be a great gift, for any occasion — from a holiday or birthday gift to a retirement or ‘Thank You’ recognition.

On the naming of naval vessels

The US Navy has not been very consistent about how or why it names certain naval vessels. Given today’s political environment, it’s no surprise that even the naming of ships has taken on controversial tones with politicians of all stripes looking for reasons to get up in arms.

A story from San Diego describes Congressional dissent over recent naming decisions. The Navy would benefit from a significant review of how names are assigned, and from some standardization, such as using a certain type of name for a certain type of vessel. Some additional rules, such as not selecting a person until at least, say, ten years after they’ve passed away, would also be valuable. Each action would take a lot of the politics out of the decision-making, I think. The Navy must name its vessels, and this is an opportunity to recognize its illustrious history, and that of the country itself. If the Navy could do something that would reduce the political backlash it receives for its actions, and improve its own profile in the process, it really ought to consider doing it.

It’s a shame that there’s so little standardization in the naming of US naval vessels, and that politicians use these items to make unnecessary political hay, and that nothing will significantly change, regardless of what happens. But we can always hope.

New and updated content

Lots more new content in the past few weeks. The following resources are new, and include three more Navy Records Society volumes:

In addition, I updated holdings for the following site, which corrected some errors and added lots of new ships:

 

As always, please let me know if you have ideas for content to add. I have enough to keep me busy for the next few years, but I always welcome more suggestions!

Great new review of ShipIndex from Charleston Advisor

I got back from the Charleston Conference last night. I couldn’t stay for the conference, unfortunately, but I did get to attend, and present at, a pre-conference. I didn’t present on ShipIndex (though I did meander aimlessly about it while we were working through some technical difficulties and they needed me to say something – anything! – into the microphone…), but I did get some great ShipIndex news while I was there.

ShipIndex was just reviewed in The Charleston Advisor, a well-known and well-respected source for “Critical Reviews of Web Products for Information Professionals”. The review appears in the October issue, and a copy was distributed to all attendees at the Charleston Conference. ShipIndex got 4-1/2 stars, out of a possible 5, and a very positive review. The summary of the review includes this bit regarding content: “This unique, comprehensive and authoritative database provides a wealth of information about ships. Links to external content pull all of the information about each vessel together in one place. It is a perfect database for vessel research.” Regarding pricing, the reviewer wrote, “The database is so reasonably priced it is ridiculous. You get a lot of information for very little money.”

The full review is available online, but costs a whopping $38. (Of course, the journal itself costs $295 for libraries; $495 for others…) Just trust me – it’s very positive.

To top it all off, Charleston Advisor editors gave ShipIndex the 2011 award for “Best Content”! The citation reads “Everything you ever wanted to know about ships has been aggregated in this one Web site aimed at both researchers and hobbyists. The system is packed with information, has a strong user interface and a visually appealing look. This unique service was created by Peter McCracken, one of the cofounders of Serials Solutions.”

ShipIndex also received a “Recommended” review from Choice this summer (June 2011), which described the site as “a needed research tool for maritime history, [and] useful for academic and special libraries with interested clientele.”

Good feelings all around.

Why indexing matters

I’m a huge fan of indexes, especially to magazines (aka serials, or journals), and it frustrates me quite a bit when I find useful journals that don’t have indexes to them. Here’s why.

The most important reason, most definitely, is that an index makes old issues of a magazine useful and accessible. Generally, a person receives and (hopefully) reads a particular issue. After that, the issue is stored, and eventually recycled.

(Or, perhaps, left at the local public library, if it’s not too old. I’m writing this in my local public library, and I have several recent issues of magazines to drop off in the ‘magazine exchange’ area. But the library has an understandable rule that no magazine left here be more than six months old. If that rule weren’t in place, the magazine area would be overrun with decade-old copies of magazines that no one wants, and the library would be left with the work of sorting through and recycling them all.)

When a library receives a magazine, it gets stored on shelves for a while. In niche areas like maritime history, it will likely eventually be sent to an off-site storage facility, as well. If there’s no guide to finding what’s in a given issue, then there’s basically no chance of finding anything in any particular issue. Consider a library catalog’s entry for, say, American Heritage magazine. Published for over 60 years, its subject coverage is represented in bibliographic data by basically a dozen words – and a third are in French, and two thirds of the remaining ones are duplicated. The only unique English words are “United States History Civilization Periodicals”. But with hundreds of thousands of pages in those 60 years, there’s an enormous wealth of information. Which is why they publish their own index to their magazine. Now, all those hundreds of thousands of pages are accessible to anyone with access to the index.

Maritime history publications would do well to make note of this, and to consider how their data is accessed when it’s more than a few issues old. Organizations that publish quality indexes to their resources, and then make that information as available as possible, are to be commended. As one specific example, consider the San Diego Maritime Museum’s publication, Mains’l Haul. Not only do they publish a current index to their journal, they make that publication freely available online. This is so vitally important, and should be aggressively emulated by every maritime history organization, regardless of their size.

People will be seeking articles from the entire run of Mains’l Haul for decades to come, because they take the time to make an index available to all. While it may cost money to do this (though some institutions are able to take advantage of volunteer indexers), I think it’s easy to see ways that that money will be returned in spades, and for decades to come, as people discover that past articles mention something of interest to them, and publishers of such works can then offer reprint services for those articles at reasonable fees, essentially indefinitely.

If a researcher doesn’t know that a person or a vessel is mentioned in a past article, they will not put that publication to use, and that’s a loss to the publisher, to the article’s author – whose work would be useful but won’t be found – and to history in general.

I’d like to make two additional comments:

First, don’t rely on a commercial abstract and indexing service to do this for you; while it’s great to get one’s content indexed in large databases, they will provide, at best, only a cursory summary of each article. They will not be sufficient for someone seeking a mention of a person, ship, or location that’s mentioned in, but not central to, a given article.

Second, a listing of the articles in an issue is NOT an index. (I’m looking at you.) It’s a list of article titles, and nothing more. While I suppose it’s better than nothing, it misses infinite opportunities to guide researchers to the incredible wealth of information that’s contained in a quality scholarly publication.

Please, magazine publishers: index, Index, INDEX! And if you’re really forward-thinking, make the index available for free, to anyone. Put it online as a pdf, as a searchable database, and as a text file that anyone can download and use elsewhere. What you lose in the cost of creating and distributing the index, you’ll more than make up in revenue from providing reprints and back issues, and (perhaps more importantly) in promoting and displaying the importance, value, and reputation, of the journal in question.

The death of the semantic web

I came across some interesting notes while going through old emails the other day. A message from NISO, the National Information Standards Organization, reported that the semantic web is dead, citing a post on semantico. The semantic web is a concept of presenting data in a structured format, usually as ‘triples’ (I am, absolutely, not an expert – or even that knowledgeable – on this stuff, so don’t quote me too far), so a computer can better understand what each term means.

For example, when a computer sees the word “Magellan”, it just sees a word. It doesn’t know if the word refers to an explorer, to a spacecraft, to a mutual fund, a “progressive metal/rock” band, or something else. By defining, through triples, what one means, the computer can realize that one page is talking about the explorer while another is talking about a mutual fund company.
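As a rough illustration of the idea, here is a small Python sketch that stores a few triples and uses them to tell the different Magellans apart. The “ex:” identifiers and type names are invented for this example and do not come from any real vocabulary; actual semantic web data would use a format such as RDF.

```python
# Each triple is (subject, predicate, object). All identifiers here are
# made up for illustration and belong to no real vocabulary.
TRIPLES = [
    ("ex:FerdinandMagellan", "ex:name", "Magellan"),
    ("ex:FerdinandMagellan", "ex:type", "Explorer"),
    ("ex:MagellanProbe", "ex:name", "Magellan"),
    ("ex:MagellanProbe", "ex:type", "Spacecraft"),
    ("ex:MagellanFund", "ex:name", "Magellan"),
    ("ex:MagellanFund", "ex:type", "MutualFund"),
]

def types_for_name(name):
    """Return every type declared for any subject carrying this name."""
    subjects = {s for s, p, o in TRIPLES if p == "ex:name" and o == name}
    return sorted(o for s, p, o in TRIPLES if p == "ex:type" and s in subjects)

def subjects_of_type(name, wanted_type):
    """Return the subjects with this name that are declared to be wanted_type."""
    named = {s for s, p, o in TRIPLES if p == "ex:name" and o == name}
    typed = {s for s, p, o in TRIPLES if p == "ex:type" and o == wanted_type}
    return sorted(named & typed)

print(types_for_name("Magellan"))
print(subjects_of_type("Magellan", "Explorer"))
```

The bare word “Magellan” matches three subjects, but once a type is stated the program can pick out the one that is meant.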

Such semantic definitions have been used extensively in some subject areas, but not at all in most. And one of the great challenges with it is/was solving problems among the “upper ontology” – that is, the layer that connects concepts in zoology with concepts in art history with concepts in electrical engineering with concepts in maritime history, etc. One field may work hard to define its ontology, but if that schema doesn’t mesh with other ontologies, then the systems aren’t really connected.

So I was interested to read of the effective death of the semantic web, and its replacement by schema.org. Schema.org is a nascent project being put together by representatives from the search teams at Google, Yahoo, and Microsoft’s Bing. It uses HTML microdata attributes (itemscope, itemtype, and itemprop), added to a page’s markup text, to define what something is. This is done for the benefit of search engines – so a “Magellan” that is marked with the tags

  <div itemscope itemtype="http://schema.org/Person">
    <span itemprop="name">Ferdinand Magellan</span>
  </div>

is clearly a person, while the Magellan that’s tagged

  <div itemscope itemtype="http://schema.org/Product">
    <span itemprop="name">Fidelity Magellan Fund</span>
  </div>

is something you can buy. (Note the differences in the end of each first line; the first is “/Person”, and the second is “/Product”.)

(Also: I defined the Magellan Fund as a ‘product’, because one can buy a share of it, but it might more appropriately be an ‘organization’, since there is a ticker symbol associated with it, and schema.org currently has a “tickerSymbol” attribute for Organizations.)
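To see how a program might read tags like these, here is a minimal sketch using Python’s standard html.parser module. It handles only the flat, non-nested pattern shown above, the class and function names are made up for this example, and it is certainly not how any search engine actually processes schema.org markup.

```python
from html.parser import HTMLParser

class ItemTypeReader(HTMLParser):
    """Collect (type, name) pairs from simple, non-nested microdata."""

    def __init__(self):
        super().__init__()
        self.items = []            # finished (type, name) pairs
        self._current_type = None  # type of the item being read
        self._in_name = False      # True while inside itemprop="name"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs and "itemtype" in attrs:
            # Keep only the last path segment of the URL, e.g. "Person".
            self._current_type = attrs["itemtype"].rsplit("/", 1)[-1]
        elif attrs.get("itemprop") == "name":
            self._in_name = True

    def handle_data(self, data):
        if self._in_name and self._current_type:
            self.items.append((self._current_type, data.strip()))
            self._in_name = False

def read_items(markup):
    """Parse a string of HTML and return the (type, name) pairs found."""
    reader = ItemTypeReader()
    reader.feed(markup)
    reader.close()
    return reader.items

SAMPLE = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Ferdinand Magellan</span>
</div>
<div itemscope itemtype="http://schema.org/Product">
  <span itemprop="name">Fidelity Magellan Fund</span>
</div>
"""
print(read_items(SAMPLE))
```

The point is only that the type travels with the name: a program reading the page no longer has to guess which Magellan it is looking at.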

The current schema.org structure is quite limited, and focuses primarily on people, organizations (especially local businesses), creative works, events, and locations. But it’s certainly extensible, and – if it’s generally adopted, as triples were not – it will clearly expand to other fields.

I’d love to take on extending it to vessels. It’d be pretty easy for us to modify our HTML to include these microdata attributes, and if that helps people find the information they’re seeking, then all the better for all involved. But I’m not sure what the proper levels should be. One doesn’t want to have too many levels in a structure like this, but I think that going straight from “Thing” to “Vessel” might be a bit of a jump. I imagine an intermediate step of, perhaps, “Vehicle”, would be appropriate. Then those with interest in cars, trains, airplanes, bicycles, scooters and lots more, would build out their schemas, while we could start a layout of sailing vessels.

It seems simple, but immediately becomes fairly complex. You could, for instance, split up “Vessel” entries to “HumanPowered”, “WindPowered”, and “MechanicallyPowered”, perhaps, then divide by vessel type – canoe, kayak, paddleboat; sloop, ketch, yawl, schooner, brig, brigantine, barkentine, ship, bark, hermaphrodite brig; paddlewheel steamer, ferryboat, fishing boat, battleship, oceanliner; etc., etc. Is that too much differentiation? How do you define a vessel that’s been re-rigged, from a ship to a bark, for example? How, even, do you make it clear that when you’re talking about a ‘ship,’ you’re talking about a three-masted vessel with square sails on the furthest-aft mast, rather than something that floats and is bigger than a boat?
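Purely as a thought experiment, here is a Python sketch of one slice of such a hierarchy and the markup it might produce. Everything in it is an assumption: the type names follow the suggestions above rather than anything schema.org defines for vessels, the example.org URL is a placeholder, and the sketch makes no attempt to answer the harder questions, such as re-rigged vessels.

```python
from html import escape

# Hypothetical hierarchy, child -> parent, following the suggestions above.
# None of this is taken from schema.org itself.
PARENT = {
    "Vehicle": "Thing",
    "Vessel": "Vehicle",
    "WindPowered": "Vessel",
    "Bark": "WindPowered",
    "Ship": "WindPowered",
}

def ancestry(type_name):
    """Return the chain of types from the root down to type_name."""
    chain = [type_name]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return list(reversed(chain))

def vessel_markup(name, vessel_type):
    """Render microdata for one vessel, using a placeholder vocabulary URL."""
    if vessel_type not in PARENT:
        raise ValueError(f"unknown vessel type: {vessel_type}")
    return (
        f'<div itemscope itemtype="http://example.org/{vessel_type}">\n'
        f'  <span itemprop="name">{escape(name)}</span>\n'
        f'</div>'
    )

print(" > ".join(ancestry("Bark")))
print(vessel_markup("Example Vessel", "Bark"))
```

Even this tiny version shows the trade-off: every extra level makes the chain from “Thing” to a specific rig longer, which is exactly the question of how much differentiation is too much.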

Lots of other terms could be added or defined over time. When the computer can understand what the term means, rather than just presenting the term to the world, it will make it much easier for individuals to draw understanding and make connections from within large bodies of marked-up data.

It would appear that this system, because it’s fairly easily applied, has a much better chance of success than did the original ‘triples’ approach. I look forward to watching it with interest.