Metadata – Page 18 – 编目精灵III

Call for Papers: Cataloguing Section, IFLA Annual Conference 2006

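The 2006 IFLA annual conference will be held in Seoul, South Korea. The Cataloguing Section has now begun calling for papers on the theme "Cataloging Partnerships: Principles, Projects, and Publishers". The specific topics are: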

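Extracting data from digital resources for use in bibliographic or other metadata records
Using data from publishers of digital resources to create bibliographic or other metadata records
(So from now on, cataloguing really won't need cataloguers.)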

The call for papers follows (from the IFLA-L mailing list; the website archive appears slightly later than the emails go out).

World Library and Information Congress
72nd IFLA General Conference and Council
Seoul, 20-24 August 2006

Cataloguing Section
Division for Bibliographic Control

CALL FOR PAPERS
Programme Theme: Cataloging Partnerships: Principles, Projects, and Publishers

The IFLA Cataloguing Section (IFLA CATS) invites cataloguers and others involved in the following projects or activities to express their interest in making presentations at the section's programme in Seoul:
– extracting data from digital resources for use in bibliographic or other metadata records;
– using data from publishers of digital resources in creating bibliographic or other metadata records.

Send a detailed abstract (1 page or at least 300 words) of the proposed paper (must not have been published elsewhere) and relevant biographical information of author(s)/presenter(s) by 15 December 2005 via email to:
Judy Kuhagen
Incoming Chair, Cataloguing Section
jkuh@loc.gov

The abstracts will be reviewed by members of the Cataloguing Section's Standing Committee. Successful proposals will be identified by 31 January 2006. Full papers will be due by 15 April 2006 to allow time for review of papers and preparation of translations; papers should be no longer than 20 pages. 15-20 minutes will be allowed for a summary delivery of the paper during the Cataloguing Section's programme.

Please note that the expenses of attending the Seoul conference will be the responsibility of the author(s)/presenter(s) of accepted papers.

OCLC Software Contest Results

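Time flies. Today I saw that the results of the OCLC software contest had been announced, and when I looked up my earlier report I found that more than four months had already passed.
The winner is Dazhi (David) Jiao of the United States; judging by the name, he is Chinese from the mainland. The winning entry is an OPAC that shows a list of harvested related documents when it displays a detailed bibliographic record ("an OPAC that includes a ranked list of harvested citations when a detailed bibliographic record is displayed"). The judges considered the entry "an innovative way of integrating OPACS with harvested metadata" that "made good use of open source software from OCLC".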

OCLC also provides links to related information. The link to the winning entry is:
Dazhi Jiao's CAT OAI; an OPAC System with OAI Integration
http://129.79.32.196:8080/catoai/index.jsp

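This experimental system currently covers only physics-related content. The search results list looks like that of an ordinary OPAC, but the detailed view of each bibliographic record also includes digital materials sorted by relevance, harvested from OAI repositories.
Take the detailed bibliographic record for "The Handbook of plastic optics": the details shown for the first digital resource include its title, URL, author, abstract, subjects and so on. It links directly to the resource itself, which is a full-text document. I was quite impressed.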
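How CAT OAI actually ranks the harvested records is not described in the announcement, so the following is only a minimal sketch of one possible approach: score each harvested record by word overlap (Jaccard similarity) with the title and subjects of the catalogue record. The record layout, the identifiers, the stopword list and the function names are all assumptions made up for illustration.

```python
import re

# Illustrative stopword list (an assumption, not taken from CAT OAI)
STOPWORDS = {"the", "a", "an", "of", "on", "and", "for", "in"}

def tokens(text):
    """Return the set of lower-cased words in text, minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def rank_harvested(bib_record, harvested):
    """Rank harvested records by Jaccard word overlap with the catalogue record."""
    query = tokens(bib_record["title"] + " " + " ".join(bib_record["subjects"]))
    scored = []
    for rec in harvested:
        doc = tokens(rec["title"] + " " + rec.get("description", ""))
        union = query | doc
        score = len(query & doc) / len(union) if union else 0.0
        if score > 0:  # drop records that share no words at all
            scored.append((score, rec["identifier"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored

# Hypothetical sample data standing in for a catalogue record and harvested records
bib = {"title": "The handbook of plastic optics", "subjects": ["Optics", "Plastics"]}
harvested = [
    {"identifier": "oai:example.org:1", "title": "Polymer optics for imaging",
     "description": "Plastic lenses and optics design"},
    {"identifier": "oai:example.org:2", "title": "Quantum chromodynamics on the lattice"},
    {"identifier": "oai:example.org:3", "title": "Handbook of optics"},
]

for score, identifier in rank_harvested(bib, harvested):
    print(f"{score:.2f}  {identifier}")
```

A real system would more likely use a proper search engine's relevance scoring; the point here is only that the catalogue record itself can serve as the query against the harvested metadata.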

OCLC plans to hold a contest like this every year from now on.

Metadata Everywhere

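As I recall it, what used to be called the "fields" of a database are now all called metadata, and every line of business is developing its own: e-commerce, enterprise information, government resources, statistical indicators, archives management, electronic official documents, credit information, and so on. Even what used to be the "properties" of a computer file, such as creation/modification dates and access permissions, have become metadata. ID3, the metadata of MP3 files, defines dozens of attributes including composer, lyricist and performer; and the metadata of digital photo files is elaborate enough to define the latitude, longitude and altitude at which the picture was taken.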
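To make the MP3 case concrete, here is a small sketch that reads the oldest and simplest form of ID3, the fixed 128-byte ID3v1 block at the end of a file. Note that this is not the version with dozens of attributes: composer, lyricist and the like belong to the later, frame-based ID3v2, which is not handled here. The function name and the sample bytes are made up for illustration.

```python
def parse_id3v1(data: bytes):
    """Parse an ID3v1 tag from the last 128 bytes of MP3 file content.

    Layout: "TAG" (3) + title (30) + artist (30) + album (30)
            + year (4) + comment (30) + genre (1) = 128 bytes.
    Returns a dict, or None if no ID3v1 tag is present.
    """
    tag = data[-128:]
    if len(tag) < 128 or tag[:3] != b"TAG":
        return None

    def text(chunk):
        # Fields are padded with NUL bytes (sometimes spaces)
        return chunk.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(tag[97:127]),
        "genre": tag[127],  # numeric index into the ID3v1 genre list
    }

# Hypothetical sample: fake audio bytes followed by a hand-built tag
sample = (
    b"\xff\xfb" + b"\x00" * 64
    + b"TAG"
    + b"Demo Song".ljust(30, b"\x00")
    + b"Demo Artist".ljust(30, b"\x00")
    + b"Demo Album".ljust(30, b"\x00")
    + b"2005"
    + b"".ljust(30, b"\x00")
    + bytes([255])
)

print(parse_id3v1(sample))
```

In practice one would pass the bytes of a real file, for example `parse_id3v1(open(path, "rb").read())`, where `path` is whatever MP3 file is at hand.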
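I used to think that Google's keyword search only needed artificial intelligence to analyse the relationships between words and build up a vocabulary (a semantic web, an ontology, or something of the kind?), with no need for metadata. But over the past half year Google has launched one specialised search tool or feature after another: citations in Google Scholar, programme listings in Google Video, business and institution information in Google Maps, reviews and theatre information for films (the movie: operator), and most recently the widely disputed AutoLink feature of the Google Toolbar, which automatically adds links to web pages. Seeing all this, I finally understood that behind Google's simple search interface there must lie extremely complex metadata, used to organise the seemingly unordered information gathered by machine.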

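Our own machine-readable cataloguing format, MARC, is nearly 40 years old and can perhaps be called a forerunner of metadata. It defines a great many fields and subfields, and although not all of them have to be used, it looks tiresome. So people dissatisfied with MARC designed the Dublin Core metadata (DC) to replace the cumbersome MARC: a dozen or so elements would be enough, which felt great. But it gradually turned out not to be enough, so qualifiers were added, first standard qualifiers and then user-defined ones, and by now it is not all that far from MARC.
In fact, the deeper you go into a thing, the finer the analysis inevitably becomes and the more elements you need, just as in the MP3 and digital photo examples above. Look at ONIX, the metadata that publishers use to describe book information: with nearly 200 elements (tags), it is if anything more cumbersome than MARC.
So it seems that for a long time to come MARC, which describes all kinds of documents comprehensively, is quite safe and will not be phased out. The 2709 format, a product of the magnetic tape era of sequential reading, may well be replaced by XML or some other format as times change, but MARC's basic fields and subfields should not change much.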
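To illustrate that last point, here is a rough sketch of how the same fields and subfields can move from the 2709 container into XML unchanged. It builds a tiny ISO 2709 record (24-character leader, directory of 12-character entries, then the field data), parses it back, and writes it out with MARCXML-style element names. The sample record is invented, the XML namespace is left out, and the data is assumed to be UTF-8; treat it as a sketch rather than a full converter.

```python
import xml.etree.ElementTree as ET

# ISO 2709 control characters: field terminator, record terminator, subfield delimiter
FT, RT, SF = b"\x1e", b"\x1d", b"\x1f"

def build_record(fields):
    """Serialise (tag, body) pairs into an ISO 2709 record (used here for sample input)."""
    directory, data = b"", b""
    for tag, body in fields:
        body = body + FT
        # Directory entry: tag (3) + field length (4) + starting position (5)
        directory += f"{tag}{len(body):04d}{len(data):05d}".encode("ascii")
        data += body
    base = 24 + len(directory) + 1   # base address of data
    total = base + len(data) + 1     # whole record, including record terminator
    leader = f"{total:05d}nam  22{base:05d}   4500".encode("ascii")
    return leader + directory + FT + data + RT

def parse_record(raw):
    """Split an ISO 2709 record into its leader and a list of (tag, body) pairs."""
    leader = raw[:24].decode("ascii")
    base = int(leader[12:17])
    directory = raw[24:base - 1]     # exclude the directory's field terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        fields.append((tag, raw[base + start:base + start + length - 1]))
    return leader, fields

def to_xml(leader, fields):
    """Render parsed fields with MARCXML-style element names (namespace omitted)."""
    record = ET.Element("record")
    ET.SubElement(record, "leader").text = leader
    for tag, body in fields:
        if tag < "010":              # control fields 001-009: no indicators or subfields
            ET.SubElement(record, "controlfield", tag=tag).text = body.decode("utf-8")
        else:
            df = ET.SubElement(record, "datafield", tag=tag,
                               ind1=chr(body[0]), ind2=chr(body[1]))
            for chunk in body[2:].split(SF)[1:]:
                ET.SubElement(df, "subfield", code=chr(chunk[0])).text = chunk[1:].decode("utf-8")
    return ET.tostring(record, encoding="unicode")

# Invented sample record: a control number and a title field
sample_fields = [
    ("001", b"demo0001"),
    ("245", b"10" + SF + b"aThe handbook of plastic optics"),
]
raw = build_record(sample_fields)
leader, parsed = parse_record(raw)
print(to_xml(leader, parsed))
```

The tags, indicators and subfield codes come through untouched; only the packaging (leader offsets and directory versus nested elements) differs between the two forms.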