CENDI PRINCIPALS AND ALTERNATES MEETING
National Library of Education
400 Maryland Avenue, Washington, DC
June 2, 1998
WELCOME
Tom Pedtke, Chair, began the meeting at 9:10 am. He thanked the National Library of Education for hosting the meeting. Introductions were made. The agenda was reorganized to accommodate the late arrival of several speakers. However, the minutes are presented in the order of the agenda.
Ms. Carroll introduced the program. One of the major goals identified for the 1998 CENDI year was to increase connections with other sectors of the information industry. The sectors represented in this program include academia, the industrial sector, and the Intelligence Community. In all cases, these information industry sectors are users of CENDI agency products. In addition, they serve the same scientific and technical communities. CENDI is interested in how these sectors see their clients' information needs and usage changing over the next few years.
Physical Sciences Information and Academic Research
Dr. Neal Kaske, Head Librarian, Engineering and Physical
Sciences Library, University of Maryland, College Park
Dr. Kaske indicated that when speaking of the academic library situation and its connection to academic research it is important to talk about the information needs of students and faculty, collection management, and serials. What do researchers tell us about their needs? Dr. Kaske is currently studying this issue and how the needs are changing. First, faculty say there is too much information. However, except for the selected journals in their fields, faculty members are not the direct consumers of the information -- doctoral students are. This raises the question, "Who must you satisfy?"
The expectations for information services increase as the student's degree level increases. Undergraduates are satisfied with a few sources -- the fewer pages the better. In an informal survey, Dr. Kaske polled the approximately 120 participants in a University of Maryland honors program called GEMStone, funded by NSF. Dr. Kaske found that fewer than one-third of these students had online catalogs in their high school libraries. However, all used libraries. This increases the need for training and promotion of electronic resources, particularly among the freshmen.
While students are using the Internet heavily, he believes the students are not evaluating the Internet content. Faculty have mentioned that more resources are being used, but the quality of the papers is deteriorating. This is because the quality of the sources used by students on the Net has not been mediated by the librarian's reference interview or by the library collection development process. Students are also not properly citing their Internet resources. There is more plagiarism, with cutting and pasting from web sites into their own papers without an understanding of citation and acknowledgment requirements. In some cases, as the faculty grade these papers, students are learning that not all information on the Internet is true or reputable.
In terms of collections management, most academic libraries, including the University of Maryland, are dealing with a hybrid environment that includes both paper and electronic resources. Much of the emphasis within the library's collection management plan has been on procuring electronic resources. Several years ago, there were fewer than 25 databases. Now there are more than 200 available from the library's web site. This makes staff and user training more difficult. One of the concerns that has been raised by researchers is that they are losing their opportunity for browsing and the positive serendipity that comes with it. He suggested that in the future, terminals will allow easier browsing.
Access and ownership issues are of prime concern to academic librarians, particularly in the sciences. Who owns the electronic resource? How do you continue access if the company goes out of business? What are the library's rights to "back issues" if they cancel the subscription? The views of the library as archive and the "journal of scholarly record" are still being debated. In some cases, paper copies have also been bought, and there are groups such as HighWire Press and JSTOR that are working to provide a central repository for electronic journals.
Cost is also a major issue with electronic resources. They tend to cost more than the paper subscriptions alone because many publishers want to sell paper and online access together for more than the price of the paper subscription. Dr. Kaske is particularly concerned about the continued rise in the cost of journals, including electronic journals. The University of Maryland libraries will spend a large part of their additional budget funds ($1.7M) this coming year just to cover the increase in cost (11 percent) for current journal subscriptions.
There are several interesting studies on the cost of publishing including that by the Coalition for Networked Information (CNI). The Serials Online Newsletter is also a valuable source of information. While he has spoken to publishers who have explained the costs of publishing, many of them are commercial publishers who are seeking to make profits from the publishing industry. Generally, society publishing is cheaper. This was borne out by a study by Robert Kirby (UC Berkeley) that analyzed the bits per dollar for major journals. In all cases, the commercially published journals were returning fewer bits per dollar than society journals.
The University of Maryland is a member of the Association of Research Libraries (ARL) SPARC program (see CENDI minutes from April 7, 1998). There is a movement underway in the academic environment to move to society publishing. Similarly, a recent JASIS article called for authors to move en masse from the "top journal", if it is getting too expensive, and declare a new, less expensive "top journal". Another point of consideration is self-publishing. Even though researchers can publish on their own or department web sites, they still do not get the peer or tenure recognition from this type of non-reviewed publishing.
The University of Maryland also has extensive government documents and technical reports collections. The move from print and microfiche formats to an online environment for these materials is changing government documents and technical reports librarianship greatly. The changes will certainly provide wider access and delivery to patrons.
While access seems to be the issue, it is interesting to review research by organizations that are doing document delivery via CARL's UnCover service. If clients are allowed to go directly to these services, the question is whether this will substitute for journal purchase. They found that use is phenomenally low.
Sci-Tech Information in the Corporate Research Setting
Suzanne Cristina, Information Manager, UTC Information Network,
United Technologies Research Center
Ms. Cristina discussed the current use of sci-tech information in the specific environment of United Technologies Corporation (UTC). Over the last year, UTC has restructured its corporate libraries from nine distributed physical library collections to a single consolidated one. This was in response to the changing internal and external information environment. In addition to a centralized "Printed Resources Group", the following other elements were created in the UTC Information Network: 1) research analysts; 2) information managers; 3) global information support; 4) INET team; 5) centralized collection; and 6) globally located. The use of the information has been impacted by the new structure of the libraries. The direct impact on researchers is difficult to tell since the consolidated library is located where most of the basic research is being conducted. However, there are more requests from customers for high level information.
The restructuring of the libraries has resulted in new roles for librarians. The librarians are now doing more information analysis. They are directly involved on strategic planning, development, and research teams within the divisions. Much of their time is spent writing analyses and white papers -- synthesizing the information, rather than just locating the information. Currently the information analysts specialize in business and technical information. Ms. Cristina expects technical research analyst positions to be created in the near future.
The librarians are also involved with the Internet/Intranet team that develops the web site and organizes the web-based collections. The Internet and telecommunications technologies have increased the ability of such a large multinational corporation to communicate effectively. Because there are information managers in Japan and Germany as well as the United States, there is support 24 hours a day.
Different search tools have also made the information manager's life easier. Nexis Tracker is used to set up a search strategy. This product puts the results directly on a web page in HTML. This type of product would be extremely helpful with other databases, including those provided by the government.
The information analysts are often involved as second-line support for their researchers. While many will use the information resources themselves, they often request help. For example, the end users are finding some information on ChemFinder, but the librarians are still asked to find more detailed information, such as Chemical Abstracts Registry Numbers. Ms. Cristina has found that the researchers also still want everything printed out.
In response to a question, Ms. Cristina responded that it is hard to gauge the difference in the number of reference queries today versus previously because of the availability of extensive electronic resources via the Intranet/Internet, which happened at the same time as the restructuring and centralization of print services. Record keeping has changed so comparisons are difficult.
In terms of what the researchers want, alerts and updates such as ISI's Corporate Alert Service and Technical Insights are very popular. However, the library still does some customized SDI services. The CBD is generally used via Dialog and Nexis, since the search engine on GPO Access needs to be improved. NTIS is used, but Ms. Cristina would like to have more extensive bibliographic information available from the web site, not just the titles and report numbers. STINet and NASA RECONplus are also available. The researchers are really starting to use Ei Village. One of the main library goals is to make these information sources more available through the Intranet homepage and to publicize their availability.
CENDI members asked if UTC maintained statistics regarding the use of certain resources. Ms. Cristina indicated that the I-net Team did a statistical report on which areas of the web sites were being "hit" and who was using them. However, this is not consistently done, and there is no charge back to the divisions for use of resources. The divisions contribute to the information center on an annual basis. Dr. Kaske indicated that the University of Maryland libraries use high level statistics across the thirteen libraries to describe the mix of products during negotiations with jobbers.
Ms. Cristina then addressed the impact of the cost of serials. There were substantial cuts in the number of serials before the reorganization of the libraries. This has increased the number of interlibrary loans (ILL). Certainly, the location of a single physical collection has increased inter-UTC ILL. She also has noticed more personal paper subscriptions to less expensive journals in use by the researchers, since there is no central library at which to browse them.
In addition to the books and journal articles, there are other kinds of information of importance to UTC researchers. Standards are an important type of information resource for the corporate library. IHS publications are available on the UTC Intranet. Standards are also available as part of Ei Village. UTC subscribes to NSSN/ANSI-enhanced service updates to keep up with standards that are of particular value to them.
Experts are also important to UTC. There is a database of UTC experts. They also use Community of Science, Teltech, and government Information Analysis Centers (IAC) and DTIC sites. A strength of UTC was always the technology transfer between divisions.
Patent information is also searched. PatentWeb is heavily used. IBM Patents and TradeMarkScan are also available. Of particular interest is the degree to which knowledge embedded in patent documents can be managed.
In terms of the public Internet, there is little real research information available. However, vendor catalogs are being used heavily via the Internet. Many users check the Net themselves first for such material. Ms. Cristina considers this a strength of the Internet. As part of the Intranet, the UTC Online Library Catalog is available. This was implemented in January and uses BASIS TechLibplus' Internet version.
Collection development has suffered due to the reorganization. However, this is considered temporary and plans are to address collection development in the near future.
Ms. Cristina foresees the development of improved intelligent agent technologies. She also expects ODBC-based searching of linked sites from keywords entered on the Intranet homepage. There is an increased awareness of competitive intelligence. Ms. Cristina noted that it is a growing area for the new librarian/analyst. There are new software products that help to map out and analyze technology trends. Patent analysis is significant. Pratt and Whitney, Canada, has an excellent program.
Ms. Cristina concluded with a list of what the information managers like and a wish list for improved products and services. CENDI agency products were on both lists.
Ms. Carroll mentioned that previous discussions among the CENDI members indicated that some corporations were limiting use of the outside Internet by their employees. Ms. Cristina indicated that some divisions had problems with letting their people have Internet access. A stop list is in place with such character strings as "sex", but this has caused problems with the terms on the list being embedded in key terms with different meaning. They have had to modify this list. Some divisions have "loosened" up on Internet access since it was implemented.
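The stop list problem described above can be illustrated with a minimal sketch. It assumes the filter works by plain substring matching, which the minutes do not confirm. Only the string "sex" comes from the discussion; the function names and sample queries are hypothetical. The whole-word variant is one possible remedy, not necessarily the modification UTC made.

```python
import re

# Hypothetical stop list; only "sex" is taken from the minutes.
STOP_TERMS = ["sex"]


def blocked_by_substring(text, terms=STOP_TERMS):
    """Naive filter: block if any stop term appears anywhere in the text."""
    lowered = text.lower()
    return any(term in lowered for term in terms)


def blocked_by_whole_word(text, terms=STOP_TERMS):
    """Stricter filter: block only if a stop term appears as a whole word."""
    pattern = r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


# Sample strings invented for illustration.
for query in ["sextant calibration", "Middlesex County standards", "sex"]:
    print(query, blocked_by_substring(query), blocked_by_whole_word(query))
```

With these inputs, the substring check blocks all three strings, while the whole-word check blocks only the last one.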
Coordinating Science and Technology Information
in the Intelligence Context
Mary Scott, Chief Scientist, Scientific and Technical Intelligence
Committee (STIC)
Ms. Scott described the changed environment of the intelligence community. In addition to worrying about secrets, they must now be worried about "mysteries" or "intentions". In terms of sources, the environment is moving from Intel-unique to more open sources and commercial collection. It is impossible for each agency to deal alone with this global environment that has followed the Cold War. The bottom line is that the Intel community must build partnerships both within and outside the community. STIC is a loose consortium of the intelligence agencies. The head of STIC does not control the budgets of the member organizations. It is one of three DCI committees. STIC meets twice monthly. There are also a number of working groups based around functions and collection issues. Direction for STIC is received from the National Intelligence Officers.
STIC's role is to provide alerts on technological developments that could impact national security and to respond to specific questions from agencies. STIC is dealing with mid- to long-term issues, such as the Y2K (Year 2000) problem and how to include the proper technology in war gaming simulations.
STIC recently completed a study on the health of S&T Intelligence. They talked to customers about future STI needs. Some were new, such as technology transfer, foreign space, and opportunities for US technology sales. They are looking at long-term scenarios in the multipolar world of 2015 - 2020, where the dynamics of war have become much faster. It is more likely that the US will be involved in asymmetric rather than symmetric conflicts. Regional domination issues will abound with the US in the middle. There are many ways to lose your competitive lead, even in symmetric conflicts. It is also clear that the US goals are very evident because we publicize them. However, this is not necessarily so of our potential adversaries.
The study analyzed S&T capabilities within the intelligence community. In terms of coverage of subject areas among information analysts by man years of experience, there was about a 33 percent drop in manpower with the exception of the area of Information Warfare. There are several areas where there are only one or two people who are community experts. In many cases, their expertise is based on "old learning", rather than up-to-date education. The average experience at many agencies is 20 years. On the other hand, the military personnel are more recently educated, but they are more transient. It takes a long time for the military intelligence analyst to learn the corporate context in which s/he must work even if s/he has the technical expertise. There are only 340 intelligence analysts across 40-50 science and technology areas within a total community numbering in the tens of thousands. Conclusions were that analysts are stretched too thin and short-term analysis has been at the expense of long-term research. This results in a greater chance of missing a critical technology development.
The recommendations of the study include the need for sufficient internal manpower. There is also the need for a career development program for the analysts, continuing education, and the promulgation and development of new automated tools. Connectivity is also an issue. The analysts have classified terminals, but few have easy access to e-mail and open sources. The report also called for the recruitment of "IC Reserves", external experts from government laboratories, universities, contractors, and military reserve units to fill the gap.
Ms. Scott identified several common interests between STIC and CENDI. These include tools for analysis of S&T material, access to foreign S&T data, and satisfaction of the S&T information gaps in both the collection and analysis areas. STIC is currently doing an intensive study of these gaps and how to budget for them.
Discussion
Dr. Siegel asked whether our allies have similar problems and what their strategies are. Ms. Scott indicated that many have similar problems, or worse. However, some countries also have closer relationships with their universities than has historically been the case in the US. There is sharing among the allies. INTELINK-C, the Intelligence Community's Intranet, is available to others.
NLE Technologies and Projects
Dr. Keith Stubbs, National Library of Education
Dr. Stubbs indicated that there are a number of projects that NLE is working on that may be of interest to CENDI members. He chose to emphasize the library's Internet initiatives. The Cross-Site Indexing (URL: http://search.ed.gov/csi/), the Education Resource Organizations Directory (EROD) (URL: http://www.ed.gov/BASISDB/EROD/direct/SF), the Gateway to Educational Materials (GEM) (URLs: http://geminfo.org for project information and http://thegateway.org for the catalog itself), and the Virtual Reference Desk (VRD) (URL: http://www.vrd.org) were described.
The goal of the NLE's Cross-Site Indexing project is to enable customers to find information on any of the 150+ Department of Education-funded web sites using a simple search screen that is available at all sites. To support this project, NLE examined over 40 search engines and tested eight. They selected Ultraseek. The selection was based on its speed, scalability, search interface, and numerous other factors. Ultraseek acts as a spider, indexing the full text content of the participating web sites. This includes the metadata, PDF files, and even Lotus Notes databases. The user can elect to search all sites, all sites in a given category, or an individual site. The search interface remains the same.
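The scoped searching described above (all sites, one category, or one site, through the same interface) can be sketched with a toy inverted index. This is an illustration only and makes no claim about how Ultraseek works internally. The site names, categories, page text, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample of crawled pages: (site, category, text).
PAGES = [
    ("site-a.example", "research", "reading assessment results for grade four"),
    ("site-b.example", "research", "mathematics assessment framework"),
    ("site-c.example", "libraries", "library reading programs"),
]


def build_index(pages):
    """Map each lowercase word to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, (_site, _category, text) in enumerate(pages):
        for word in text.lower().split():
            index[word].add(page_id)
    return index


def search(index, pages, query, site=None, category=None):
    """Return sites of pages containing every query word, within an optional scope."""
    words = query.lower().split()
    if not words:
        return []
    # A page matches only if it appears in the posting set of every word.
    hits = set.intersection(*(index.get(w, set()) for w in words))
    results = []
    for page_id in sorted(hits):
        page_site, page_category, _text = pages[page_id]
        if site is not None and page_site != site:
            continue
        if category is not None and page_category != category:
            continue
        results.append(page_site)
    return results


index = build_index(PAGES)
print(search(index, PAGES, "assessment"))                    # all sites
print(search(index, PAGES, "reading", category="libraries")) # one category
print(search(index, PAGES, "reading", site="site-a.example")) # one site
```

The same search function serves all three scopes, which mirrors the point that the interface stays the same regardless of how narrowly the user searches.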
Currently, NLE's Ultraseek includes over 230,000 files from almost 300 sites. The majority of the files are sponsored by other federal agencies. Ultraseek is also working on sensitizing the search engine to the content of metadata. It is able to specifically search with Dublin Core, but users must use a special syntax to ask for metadata.
The Education Resource Organizations Directory (EROD) identifies organizations that provide education-related information. It contains over 2,100 state, regional, and national organizations. There are several modes of access -- simple and advanced searching, state maps, and hot/current topics. The state maps are especially popular with users. The contractor maintains the directory of organizations by traditional searching, through phone calls, etc. They are looking to change this to a distributed input system, where the organizations register themselves through a web-based form.
The Gateway to Educational Materials (GEM) system is a "one-stop, any-stop" access to educational materials on the Internet. It resulted from a review of Internet resources that indicated there was much unstructured information and hearsay. Searchers weren't able to retrieve information with regard to appropriate grade levels and educational standards when a regular search engine was used against unstructured text. Quality was also an issue.
This led to a standard way of describing Internet educational resources. A metadata format based on the Dublin Core was developed. Fields were added to the Dublin Core to identify grade level and other characteristics needed to describe curricula. The format was developed independent of syntax, and it can be implemented in a variety of ways including XML.
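As a rough illustration of a syntax-independent format, the sketch below holds one record as plain field/value pairs and renders it two ways, as XML and as HTML meta tags. The title, creator, subject, and identifier names are Dublin Core elements. The "grade_level" field name, the "record" wrapper element, and all values are hypothetical and are not taken from the actual GEM specification.

```python
import xml.etree.ElementTree as ET
from html import escape

# One record as plain field/value pairs. "grade_level" stands in for the kind
# of field GEM added to Dublin Core; its name and all values are hypothetical.
record = {
    "title": "Introduction to Fractions",
    "creator": "Example Author",
    "subject": "Mathematics",
    "identifier": "http://www.example.org/lessons/fractions",
    "grade_level": "4",
}


def to_xml(rec):
    """Serialize the record as a flat XML document, one element per field."""
    root = ET.Element("record")
    for name, value in rec.items():
        child = ET.SubElement(root, name)
        child.text = value
    return ET.tostring(root, encoding="unicode")


def to_meta_tags(rec):
    """Serialize the same record as HTML meta tags, one tag per field."""
    return "\n".join(
        f'<meta name="{escape(name)}" content="{escape(value)}">'
        for name, value in rec.items()
    )


print(to_xml(record))
print(to_meta_tags(record))
```

Both outputs carry the same fields; only the surface syntax differs.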
A brief controlled vocabulary was also developed after consultation with educational experts. The NICEM and ERIC thesauri can also be used. However, it was determined that ERIC may not entirely cover the domain, particularly for the sciences. Mapping and crosswalking of the vocabularies are being planned. Catalogers can enter uncontrolled terms that are collected and reviewed as candidates for new controlled terms.
There are other Internet initiatives with regard to educational resources. The Instructional Management System (IMS) promotes reuse of educational components in a "legos" building block concept. It is heavily used for distance learning initiatives. GEM has a memorandum of understanding with IMS to make the two metadata systems as compatible as possible. The museum community is also talking to GEM, since objects are of value in the educational process.
In addition to the metadata format and controlled vocabularies, NLE supports the development of software tools. GEMCat is used in a distributed environment to create validated catalog records. Vocabularies can be plugged into the metadata creation software. Over half of the participants are Mac shops, so they will be rewriting the tools in Java to support platform independence. NLE has helped participating sites to map to the GEMExchange format. This is used primarily where there is a local catalog in a different format.
The GEM system is currently a Union Catalog of 1,335 records for lesson plans and curricula hosted by Syracuse, guaranteeing a stable repository. It is possible to filter or extract subsets from the Union Catalog based on a number of criteria. The architecture will support multiple local catalogs connected by a Z39.50 interface which will be added to the catalog later this year. It is maintained in Personal Library Software, but will be migrated to Oracle or Sybase. The University of Washington is testing the use of Access, but it will probably not scale easily.
Quality and standards are central to the success of GEM. The records are not pure description, but include indications of quality. There are links to academic standards. Unfortunately, there are multiple authorities in the area of curriculum quality and the indicators do not map well. The University of Georgia is developing a scheme for rating and applying numeric quality ratings to the records. Teachers and other expert endorsements can be included. Federal education resources have already been ranked in this fashion.
The next steps are to continue to add collection holders to the consortium. They will also be introducing a lesson/unit submission system that will allow those who do not have access to the Internet to provide materials. There will be continued emphasis on quality and standards by promoting links across various academic standards measures. It is also hoped that the resources can be linked to discussion forums. NLE will be investigating natural language processing for user queries and to assist catalogers in assignment of controlled terms. GEM will also be extended to add other types of educational materials. NLE has already used GEM to describe AskA services, and has found it to be extensible.
The last Internet initiative is the Virtual Reference Desk (VRD). This project establishes a national cooperative digital reference service for K-12. There are currently numerous services, but it is difficult for them to know about each other and the specific areas of expertise. The key functions of the VRD include "Meta-Triage", which determines if there are already known answers and how to route a question. Much of this system will be self-service. NLE is investigating automated parsers and knowledge bases to help the humans answer the questions. The Knowledge Base provides the basis for reusing answers. VRD also provides ways for mentors/volunteers and other unorganized entities to become organized through resources, training, and matchmaking. The AskA Consortium is a cooperative network that seeks to provide resources. It is the guiding body for the VRD. A registry of consortium members is maintained in AskA+ Locator.
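The "Meta-Triage" function described above can be sketched as a two-step decision: reuse a stored answer if one exists, otherwise route the question to a service with matching expertise. The sketch is a guess at the general shape, not the VRD design. The knowledge base entries, service names, keyword matching, and function names are all hypothetical.

```python
# Hypothetical knowledge base of previously answered questions.
KNOWLEDGE_BASE = {
    "why is the sky blue": "(stored answer text)",
}

# Hypothetical registry mapping subject keywords to AskA-style services.
SERVICE_REGISTRY = {
    "math": "ask-a-math-expert",
    "volcano": "ask-a-geologist",
}

DEFAULT_SERVICE = "general-reference-desk"


def normalize(question):
    """Lowercase and drop punctuation so similar questions compare equal."""
    kept = "".join(ch for ch in question.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def triage(question):
    """Return ("answered", answer) for a known question, else ("routed", service)."""
    key = normalize(question)
    if key in KNOWLEDGE_BASE:
        return ("answered", KNOWLEDGE_BASE[key])
    words = key.split()
    for keyword, service in SERVICE_REGISTRY.items():
        if keyword in words:
            return ("routed", service)
    return ("routed", DEFAULT_SERVICE)


print(triage("Why is the sky blue?"))
print(triage("How does a volcano erupt?"))
print(triage("Who wrote Hamlet?"))
```

A real system would need fuzzier matching than exact normalized text and single keywords, which is presumably where the automated parsers mentioned above would come in.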
Other proposed VRD services include "incubators" that link up interested organizations and individuals to create new digital reference services and to provide training, software, and expertise to these "start-ups". Match Making would put volunteers in contact with AskA services that are in need of their expertise.
Discussion
CENDI members asked how the evaluation of sites was done. GEM membership is required in order to contribute. (A consortium with a governance structure, including both users and producers, is being formed.) In order for a site to be included, the user must get something of value from the Internet site for free. There may be "for fee" areas also connected to the site, but these must be in addition to the free material. GEM is really two things -- the protocol and the catalog. The protocol is distributed freely, but the catalog requires membership.
There was significant work done on how teachers, primarily K-12, approach Internet resources. The teacher study showed that teachers do not use lesson plans intact. They modify what they find, using the best from multiple sources. The study is available on the GEM Web site. QED is about to come out with another study. The approach of the teachers in the NLE study may be different because they were primarily dealing with Teachers of Excellence.
CENDI members also asked about the audience for GEM. Dr. Stubbs indicated that, while there are items added to GEM that are for direct use by the students, they are not emphasizing this.
Dr. Aung (NSF) indicated that a study by NSF found there is a significant difference between the approach to resources on the Internet by students versus teachers. He indicated that students are more likely to go to the Internet and explore than teachers are. Dr. Stubbs indicated again that teachers may be restricted by their time, and the teachers polled by NLE may have been above the "run of the mill".
It was noted that the GEM and VRD services use an ".org" domain extension rather than ".gov". Dr. Stubbs indicated that this was intentional since there are numerous organizations involved, many of which are not government. It is hosted at Syracuse, so it was not a major problem to establish the ".org" domain.