Information Retrieval

DubMed 1.1.3 now up

DubMed has migrated to a better server. Its database is now MySQL, and PHP3 does the metadata linking instead of CORBA (now gone). See cvsweb for the code, and the project page for more info.

catalog and roads -- search engines for you and me

Both the Ecila (French) search engine codebase, catalog-1.0, and the U.K. eLib end product ROADS v2+ are open source and increasingly used tools for building web-based catalogs à la Yahoo. Some eLib folks have explicitly turned to open source as a way to keep formerly well-funded projects going (see the press release describing this decision). Open source as "exit strategy" isn't terrifically sustainable, but it's a step in the right direction nonetheless.

free Z39.50 implementations page -- who knows one?

the LOC Z39.50 Software page has been up a long time, and the tools it lists are probably due for a side-by-side comparison. can anyone recommend any of them (at least one is under GPL)? i'd love to be able to plug into ZETA, for instance, and pull data out for Jake... anyone tried something similar?

Database Advisor from UCSD

from the DBA Sciences page: "Database Advisor (DBA) was created to aid database users in selecting the best database for their query. DBA spawns a search process for each database vendor, and returns the hits on the query to the user. It sorts these results so the user can see where each database stands relative to the others." DBA is GPL'd, and its components are all free according to one license or another. You can even take it for a test drive...
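
The fan-out-and-rank idea in that description can be sketched in a few lines of Python. This is only an illustration of the pattern, not DBA's code: the backends are hypothetical in-memory stand-ins (db_a, db_b, db_c) rather than real vendor connections, and threads are used here where DBA is described as spawning a search process per vendor.

```python
from concurrent.futures import ThreadPoolExecutor


def make_fake_backend(documents):
    """Hypothetical stand-in for one vendor's database: returns a search
    function that counts documents containing the query string."""
    def search(query):
        term = query.lower()
        return sum(1 for doc in documents if term in doc.lower())
    return search


def rank_databases(query, backends):
    """Run the query against every backend concurrently and return
    (database_name, hit_count) pairs, most hits first (ties by name)."""
    names = list(backends)
    with ThreadPoolExecutor(max_workers=max(1, len(names))) as pool:
        counts = list(pool.map(lambda name: backends[name](query), names))
    return sorted(zip(names, counts), key=lambda pair: (-pair[1], pair[0]))


# Assumed sample data, purely for illustration.
backends = {
    "db_a": make_fake_backend(["Asthma in children", "Asthma therapy", "Diabetes care"]),
    "db_b": make_fake_backend(["Teaching reading", "Asthma education in schools"]),
    "db_c": make_fake_backend(["Signal processing"]),
}

if __name__ == "__main__":
    for name, hits in rank_databases("asthma", backends):
        print(f"{name}: {hits}")
```

Sorting on hit count alone is the simplest reading of "where each database stands relative to the others"; a real tool might also weigh cost or coverage.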

SLRI: web to Z39.50

the Simon Fraser University Library Research Instrument (SLRI) is "a web to Z39.50 client interface" brought to you by the good folks at SFU. it's an adaptation of the web to Z39.50 gateway developed by Harold Finkbeiner at Stanford, licensed under GPL and recently spied at as well.

CDS/ISIS: tell us more

I've now seen CDS/ISIS and its variants mentioned in several places and am still confused about what it is, but here's a brief description nonetheless. from the UNESCO ISIS page: "Micro CDS/ISIS is an advanced non-numerical information storage and retrieval software developed by UNESCO since 1985 to satisfy the need expressed by many institutions, especially in developing countries, to be able to streamline their information processing activities by using modern (and relatively inexpensive) technologies. The software was originally based on the Mainframe version of CDS/ISIS, started in the late '60s, thus taking advantage of several years of experience acquired in database management software development." take 2, from the CDS-ISIS user forum site: "Mini/Micro CDS/ISIS is a text retrieval program, designed and distributed free of charge by UNESCO. It is widely used for bibliographic (and other) databases throughout the world, and especially in developing countries." If I understand all this properly, it is basically a non-relational database environment commonly used by libraries and other, largely nonprofit, organizations (20,000+ of 'em) throughout the world. I pulled down the unix version but can't quite make heads or tails of it. Somebody please explain more... update: collected comments from all who offered are available here.

muscat-0.1.0: Dialog Corp IR library

as seen at freshmeat: "Open Muscat is a high performance open source search engine library. It implements the probabilistic model of information retrieval, and is designed for use in applications ranging from full scale Web search engines to searching through email archives." what this doesn't say: muscat comes from the Dialog Corp. and what it also doesn't say: the muscat 'version' of the GPL is missing a significant section of the Real GPL, including the final paragraph which states "This General Public License does not permit incorporating your program into proprietary programs." which, apparently, Dialog doesn't understand, because they explicitly solicit requests for commercial licenses as well. somebody please tell them about the LGPL...

[Update, years later: IIRC, the post author was an idiot. This was a legit use of the GPL.]
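
For anyone wondering what "the probabilistic model of information retrieval" in the freshmeat blurb looks like in practice, here is a rough sketch using Okapi BM25, one widely used weighting formula from that family. The blurb doesn't say which weighting Open Muscat actually uses, so BM25 is an assumption swapped in for illustration; the function name, the parameter defaults (k1=1.2, b=0.75), and the sample documents are all made up here.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the tokenized query with
    Okapi BM25. Returns one float per document, in input order."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Rarer terms get a larger weight (the +1 keeps it positive).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term frequency saturates, and long documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Assumed toy corpus, whitespace-tokenized.
docs = [
    "open source search engine library".split(),
    "searching through email archives".split(),
    "library catalog search for email and web search".split(),
]

if __name__ == "__main__":
    for doc, score in zip(docs, bm25_scores("email search".split(), docs)):
        print(f"{score:.3f}  {' '.join(doc)}")
```

The third document should come out on top because it matches both query terms; a real engine would add stemming (so "searching" matches "search") and an inverted index instead of scanning every document.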
