CENDI PRINCIPALS AND ALTERNATES MEETING

Defense Technical Information Center
Ft. Belvoir, VA
December 2, 1997

Minutes

National Agricultural Library (NAL): Activities and Plans
The Agricultural Network Information Center (AgNIC)
Digital Government of the 21st Century
STINET: Public and Private Faces
Technology Navigator

WELCOME

Tom Pedtke began the meeting at 9:15 am. Introductions were made.

National Agricultural Library (NAL): Activities and Plans
Pamela Andre, Director

Ms. Carroll introduced the keynote session by noting the long-standing interests that the USDA/NAL and CENDI have shared. NAL has often participated in CENDI meetings. Now that the electronic environment is upon all of us, there is renewed interest in finding out what other STI organizations, including NAL, are doing and how we can best work together.

Pamela Andre described NAL's organization and mission. USDA has been reorganized, placing NAL [http://www.nal.usda.gov] under the Agricultural Research Service, which is under the Under Secretary for Research, Education, and Economics. This puts the NAL in the midst of the research stream.

Despite the change in the organization structure, the mission of the NAL has not changed. There are three main emphases: 1) serving as the nation's chief agricultural resource; 2) relationships and activities with national organizations that include the land grant universities and the agribusiness community; and 3) international activities, where NAL is a member of a number of organizations that provide the structure and access for the world's agricultural literature.

NAL is the largest organization in the world that focuses on agricultural information. It has over 3.2 million volumes in its collection in all forms and formats. It has a focus on far more than traditional agriculture. The emphasis reflects the budgeted programs of the department, such as rural information, plant genetics, and global change. Sixty percent of USDA's budget is focused on food and nutrition. Sixty percent of USDA staff is related to the environment.

Ms. Andre directs the NAL with three associate directors in charge of the Technical Services, Public Services, and Information Systems divisions. Technical Services includes Acquisitions and Serials and Cataloging and Indexing, and receives 23,000 journal titles and 15,000 books per year. This is approximately 150,000 pieces annually. NAL is acquiring more non-print media, particularly CD-ROM. There is also a large historic collection. The archival files from many organizations have been deposited at the NAL, and there is some true art among the botanical and pomological illustrations within the collection. Recently, a collection of original letters to and from Thomas Jefferson was located within the collection of a retired researcher. So many donations have been received over the years that it is impossible for the staff to catalog them all. It is difficult to tell what the library actually has in terms of the history of agriculture.

Agricola, the bibliographic database, reflects the NAL collection and those of the land grant colleges and universities. The original concept of Agricola was that of a union catalog. Agricola also includes selected journal article records. The concept of the current bibliographic database is being impacted by Web concepts. Agricola will be available via the Internet in 1998. However, the print bibliography still exists and it is likely to remain for some time.

Public Services receives approximately 60,000 reference requests annually. It also supports NAL's mission by processing 200,000 document delivery requests on an annual basis. The USDA installations are so geographically dispersed that there is not a lot of walk-in traffic to the library. However, the number of document delivery orders is high. The requests are received by phone, letter, e-mail and fax.

Discussion:

***CENDI members asked about the cost of document delivery. Ms. Andre indicated that the stack management contract value for this function is approximately $1 million. However, this does not include other expenses such as copyright clearance royalties. Of the 200,000 deliveries, approximately 80-90% are from the USDA, and they are not charged for this service. ILL is performed for the land grant colleges and universities on a quid pro quo basis. While an average journal article costs the NAL about $20, they don't see this money back. This is a continuing concern for NAL because of the flat budget. There have been discussions in the past about charging the USDA for some services, but this would be a hard sell.

DTIC mentioned that they had tried charging for document delivery within DoD several years ago. However, within two years after the charges were levied, the administrators decided to make it all free because having to pay was considered an impediment to the advance of research within DoD. ***

The ARIEL system is used for document delivery transmission via the Internet. It is used by many cooperating libraries, including those of several CENDI organizations. ARIEL was introduced about two years ago and it has seen surprising growth. More than 10 percent of requests are filled this way, which is cheaper for NAL.

NAL has developed Specialized Information Centers within the library to better support high-profile issues. For example, there are specialized centers for rural information and for food and nutrition. Some Specialized Information Centers are directly funded by the NAL budget and others are funded by specific appropriations. There is no physical separation of the collection, but the focus within these "centers" is to hire people with the proper scientific and technical backgrounds. The professionals identify special collection development and service needs for their communities, perform outreach, and provide specialized reference services.

The Information Systems Division is in charge of maintaining the VTLS automated library system, which was installed at NAL in 1988. It is time to upgrade the system, but the budget situation has caused delays. This library system is extremely important, since it is used by almost everyone within the library every day.

In addition, Information Systems has spearheaded various technology projects. The National Agriculture Text Digitizing Program was one of the first CD-ROM projects begun many years ago. They are now working with professional agricultural societies to transition them to electronic products, primarily CD-ROM. The societies include the American Society of Agronomy, the Extension Workers, and Agriculture Engineers. This is done on a cost reimbursable basis. The products may include both text and images. In some cases, the text is OCRed.

The Internet will be the environment for NAL's future document delivery and database development. ISIS, the library catalog, is on the Internet and Agricola will be available in 1998.

The national cooperators are extremely important to NAL and its mission. Principal among these are the Land Grant universities. Forty-five of them were involved in the Text Digitizing project. They are also involved in evaluation, collection and preservation projects. A program plan for systematic preservation of agricultural literature has been accepted as an NEH grant.

Principal among the international cooperators is the FAO (AGRIS) of the United Nations. AGRIS is an international system for Agriculture Science and Technology. It is a cooperative venture of 190 countries. The collection and bibliographic control is performed by the individual countries and the records are shared with AGRIS, which maintains the global system. Within this system, the countries lend to each other free of charge. This is especially necessary for developing countries. AGLINET is a growing model for how we can share resources.

NAL is the U.S. AGRIS center. Approximately 50 percent of the AGRIS database is contributed by the U.S. Approximately 75 percent of NAL's coverage is given to AGRIS. AGRIS is not interested in the state and experiment station reports. The material is reindexed when it is received by AGRIS, because AGRIS uses the AGROVOC vocabulary and NAL uses the CAB vocabulary. There have been several attempts to combine the vocabularies but with little success.

NAL supports other international activities as requested by the Department. These projects may involve the Foreign Agricultural Service and USAID. There is particular interest in Latin America and Egypt.

Many of the NAL's resources are acquired through its 5,000 exchange agreements. USDA agencies that publish are partners in this program. They make several copies of the publications available to the library, do the routine mailing to the exchange partners, and help identify agreements that might be of value. The value of the exchange agreements is approximately $850,000 per year. Ms. Andre is routinely asked to explain the value of the exchange program in times of lean budgets.

The NAL's Electronic Information Initiative is an outgrowth of a strategic planning process conducted about five years ago. This planning focused not only on technology but also on issues such as copyright, intellectual property, and costs. It acknowledges that there are differences in the electronic environment (differences in how you catalog, index, access, license, charge, etc.). NAL is systematically looking at operational choices that will move it in the electronic direction. This often involves redirecting staff and funds. Commonly heard is "Change is great. You go first!"

Another effort has been in the area of preservation of electronic media. NAL is involved with Cornell University on this project. The goal is to focus the department on the need to save digital publications. A successful two-day meeting was held last spring at which NAL brought together stakeholders to talk about the issues and concerns. Several CENDI agencies were involved as speakers or attendees. The report of this meeting has been reviewed and approved by the USDA CIO. It will be distributed to participants (including those CENDI agencies) in the near future. This meeting also got the attention of the USDA CIO, resulting in a USDA-wide communication supporting the importance of preservation.

An Electronic Media Center has been developed to provide seamless access to electronic resources. It has about 600 titles, 100 remote publications, 19 databases, and provides public Internet access. This center is part of the transition to electronic services. They are hoping that as users become accustomed to electronic resources they will want more. This interest will help the NAL get funding sufficient to provide electronic access more broadly, including directly to the desktop.

The NAL Web site has been available for several years. It receives approximately 5 million hits annually and the number is growing. Important resources include the Food Guide Pyramid Database and the Constructed Wetlands Electronic Bibliography.

Discussion:

Ms. Andre was asked about plans to receive bibliographic or full text records electronically. She indicated that they have held negotiations with publishers but with mixed success, particularly with regard to beneficial pricing for the NAL.

The USDA has also turned to the NAL to provide expertise in the areas of database development and information management. The USDA Plant Genome Research Program produces information which is managed by an NAL database. This is a $16-$18 million project. The central database is made available online and on CD-ROM. NAL is also responsible for the Global Change Assisted Search for Knowledge (ASK) system for searching the Internet.

The Agricultural Network Information Center (AgNIC) has been developed via the Web to bring organization and structure to agriculture-related Web resources. The URL is [http://www.agnic.org]. The goal is to provide a single point of entry and use the skills of librarianship to provide stable, high-quality information with descriptive records. It is a collaborative project with the land grant colleges and universities, and seeks to build on the subject strengths of those institutions.

Ms. Andre was asked about the degree to which bibliographic records will continue to be emphasized. She indicated that FAO and AGRIS are in the process of looking at the whole bibliographic database system. They appear to be de-emphasizing the bibliographic record in favor of full text. While NAL will be involved in the planning process, NAL itself is not de-emphasizing the bibliographic record. Instead, NAL is looking for a marriage that will provide quicker access to the actual document.

Mr. Molholm indicated that the recent GILS evaluation by Chuck McClure found that some bibliographic records include more information than people really want. Ms. Andre indicated that some people want just a few really good things. Therefore, there is a need to emphasize higher quality resources. It is the critical evaluation of resources that is important. The implication is that there needs to be a stronger reference dialog available in the new environment.

The Agricultural Network Information Center (AgNIC)
Richard Thompson, Computer Specialist, Information Systems Division, NAL

AgNIC is based on the common vision that no one institution can provide all the information, even if it is in electronic form. AgNIC is an alliance between the NAL and several universities including Nebraska, Cornell, and Iowa. There are about a dozen or more universities and international organizations that are waiting to join the alliance. The basis for the service is a database of metadata descriptions. There are approximately 750 resources described in this database to date.

The project has three emphases:

Within the AgDB, each university takes responsibility for a subject category or categories (the Agricola codes are used). The AgDB includes both electronic and non-electronic resources including CD and tape references.

The metadata format includes the name of the resource (with the URL displayed), a description, the format, size, and other pertinent links that are in the description. NAL tries to actively maintain the integrity of the links, but providing the actual URL allows the user to back up to higher levels of the site that may still be present, even if a 404 message is found at the lower level. The AgDB emphasizes direct access to the resources of interest, often bypassing the homepages if necessary.
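The "back up to higher levels" behavior described above can be illustrated with a short sketch. It is not AgNIC's actual link checker; the function name and the example.org address are assumptions made for illustration. Given the displayed URL of a resource, it lists the progressively higher-level addresses a user (or a checking script) could try when the lowest level returns a 404.

```python
from urllib.parse import urlsplit, urlunsplit


def parent_urls(url):
    """Return the URL itself, then each higher-level URL up to the site root.

    Hypothetical helper: only string manipulation is done here, no network
    requests are made.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    candidates = [url]
    # Drop one trailing path segment at a time, keeping a trailing slash.
    for depth in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:depth])
        if not path.endswith("/"):
            path += "/"
        candidates.append(
            urlunsplit((parts.scheme, parts.netloc, path, "", ""))
        )
    return candidates


if __name__ == "__main__":
    # Hypothetical resource address, used only to show the fallback order.
    for candidate in parent_urls("http://www.example.org/a/b/page.html"):
        print(candidate)
```

A checker built on this idea would request each candidate in order and stop at the first one that still resolves.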

Discussion:

CENDI members asked who creates the metadata records. Mr. Thompson indicated that for start-up purposes the initial records were created or contributed by the libraries. The resources already available within the libraries allowed them to use "cut and paste" to more quickly produce the records. The Canadian library is also providing records. However, the intent is that the descriptions will move back to the researchers as they create new resources and the process becomes more "formalized".

CENDI members also asked about how organizations join the alliance and what the responsibilities are for membership. Mr. Thompson indicated that guidelines are now being developed. The guidelines will likely include a reference to the subject areas to which the organization is assigned, what the organization's obligations are, and what happens if the organization can no longer continue to support the alliance. A full-time coordinator has been hired to work for a year on the process.

The current metadata has been developed without real standards. NAL is now looking at options to ensure a standard interoperable format that can be mapped to MARC, GILS, and the Dublin Core.
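As a rough illustration of what mapping to the Dublin Core could look like, the sketch below converts a record with the fields mentioned earlier (name, URL, description, format, size, related links) into Dublin Core element names. The local field names and the mapping table are assumptions, not NAL's actual schema; only the Dublin Core element names (title, identifier, description, format, relation, subject) come from the standard. Fields with no obvious element, such as size, are reported rather than silently dropped.

```python
# Hypothetical local field names mapped to Dublin Core element names.
FIELD_TO_DC = {
    "name": "title",
    "url": "identifier",
    "description": "description",
    "format": "format",
    "related_links": "relation",
    "subject_code": "subject",
}


def to_dublin_core(record):
    """Split a record into Dublin Core elements and unmapped leftovers."""
    dc = {}
    unmapped = {}
    for field, value in record.items():
        element = FIELD_TO_DC.get(field)
        if element is None:
            unmapped[field] = value
            continue
        # Dublin Core elements are repeatable, so each one holds a list.
        values = value if isinstance(value, list) else [value]
        dc.setdefault(element, []).extend(values)
    return dc, unmapped


if __name__ == "__main__":
    # Invented sample record, for illustration only.
    sample = {
        "name": "Sample Resource",
        "url": "http://www.example.org/resource",
        "size": "12 KB",
        "related_links": ["http://www.example.org/a", "http://www.example.org/b"],
    }
    print(to_dublin_core(sample))
```

A similar table could be drawn up for MARC or GILS; the point of the sketch is only that a single internal record can feed several target formats once the field correspondences are agreed.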

Mr. Thompson showed several of the resources available from AgNIC. The collaboration with scientists is extremely important. The Alberta Barley database includes 133 farmers using computers via the Internet to determine appropriate seeding rates. In other cases, NAL has been able to locate and make available numerous resources of value to the community that could not otherwise be made available. For example, they created a database from an Excel database held by a group of scientists. The database is searchable and displayable in table form.

Discussion:

Mr. Molholm raised the fundamental question of the inclusion of e-mail addresses when experts are listed. Mr. Thompson indicated that this issue has been discussed but, at this point, it is up to the individual scientist. There is no overall policy. Databases such as the Directory of Experts in Agriculture may be where resistance to providing e-mail addresses occurs. This database is still in its infancy, so it is too early to tell. This directory is actually based on a collection of 40 expert directories that were developed elsewhere and then organized by Agricola Code. The scope notes of the codes are available to help the user determine the category in which he is interested. The scope notes are also helpful in providing keywords for searching. The subject categories are provided in hierarchical and alphabetic views.

Dr. Wood asked if NAL had encountered organizations with databases who wanted them included but which NAL could not include. Mr. Thompson indicated that they would tell them "no" and why they could not handle the database. It would usually be an issue of quality or scope.

NAL also identified a lack of event information (calendars) in the field of agriculture. Therefore, they have collected or created online calendars in 135 subject areas. They proactively search listservs and include the items in the calendar. There are approximately 900 meetings listed by six subject areas. In addition to the basic information about the conference, NAL tries to provide more information about the locale, which may or may not come from the conference planners. They will link to Chambers of Commerce, hotel organizations, etc. in the area where the conference is held, so the user has access to relevant information such as weather, travel and restaurants. The system will be helped by a planned move to a SPARC platform. This will allow for menuing and indexing.

ProMed is a service that announces plant disease outbreaks internationally. They are now looking into a similar system for animal diseases.

The Online Reference Services have set up a page for AgNIC. They have also set up appropriate reference tools. This is like a reference shelf for frequently asked questions. However, if a question cannot be answered by self-navigation through the material, it will be sent to a librarian within the AgNIC system.

CENDI members asked how the initial start-up program was financed. Mr. Thompson indicated that initially the colleges and universities were given financial incentives; now others are joining with no incentives and bear the resource burdens themselves. NAL is looking at some grant possibilities, but often NSF funds, for example, are not available for government agencies. The organizations do not have to contribute in all areas of system functionality--some may choose to provide a database and not contribute to the Online Reference Service. It is not NAL's desire to fund the contributions of the members of the alliance, but to get them started and provide the underlying infrastructure.

Online reference assistance of a general nature is provided at no cost. The normal rules for cost services will kick in after that. Some reference questionnaires are provided online as part of the reference process in order to help the answerer focus on who the requester is and the type of information that would be valuable.

The AgNIC system is also interested in online journals. They have made the Range and Land Management Society's journal available online. The AgNIC system requires two full-time people plus a part-time supervisor and a part-time WebMaster. In addition, Mr. Thompson spends about 60 percent of his time on the project development.

The hits to AgNIC are increasing even though NAL has not aggressively marketed the service. AgNIC server statistics, including file sizes, are available online via the server, though the file-size statistics are a little misleading because of the size of the graphics included on various portions of the pages. The hits are averaging about 200,000 per month. The links are automatically checked, but there is also a team of volunteers who check the site manually.

Digital Government of the 21st Century
Larry Brandt, National Science Foundation

Mr. Larry Brandt presented an overview of the federal information technology context, described the discussions that have led to the vision of a Digital Government, and then described the program goals, current status, and plans.

In the past, efforts within the government have focused on the development of technologies. However, these same technologies have not often been well applied within the government mission agencies themselves. Recognizing this, the CCIC (Committee on Computing, Information and Communications) Applications Council was chartered in FY 1997. Its goal is to promote early application of computer, information and communications technologies to critical government missions by supporting multi-agency leadership and cooperation. Administratively, the CCIC reports to the National Science and Technology Council of the Office of Science and Technology Policy.

The Applications Council has several working groups that are applications-oriented and work across multiple agencies to move relevant technologies into the government to solve particular problems. The working groups include crisis management, federal statistics, the Next Generation Internet, universal access, and privacy and security. Federal statistics has moved very quickly in terms of producing results. There is also significant emphasis on crisis management.

Mr. Brandt emphasized the methodologies within the government that have been designed to transfer technologies into government information systems. The Government Information Technology Services Board (GITSB or GITS Board) has provided seed money for multi-agency information technology projects. From FY95-98, the Board has funded 38 projects at $5-7M/year. The money is derived from the FTS 2000 agency billings. However, the payback requirement to keep the fund self-sustaining has been a drawback to the use of these funds by some agencies. This requirement can be waived. Funding through the GITS Board remains a source of additional funds for information technology projects.

Other activities have included the Information Technology Management Reform Act (ITMRA) (Clinger-Cohen) that gave agencies greater procurement control over information technology and information systems. This bill also created the CIO Council and required agencies to submit strategic plans for the incorporation of information technologies. The emphasis on technology transfer can also be seen in memos from OMB Director Raines.

The Federal Web Consortium was formed in 1994 and is still quite active in the areas of training, technology transfer, and collaboration with OMB. Twelve to fifteen federal agencies provide funding for this through NSF. The funding has averaged $1M/year for FY94-97.

Discussion:

Mr. Pedtke said that the Federal Web Consortium is best known for its development of Mosaic, along with NCSA, but he asked for an overview of recent activities, since Mosaic has been overtaken by commercial technologies. Mr. Brandt indicated that the Consortium has been involved in standards development; the OMB policy for federal Web sites is going to be related to the Consortium's Guidelines. The Consortium has conducted three annual workshops and trained 800-900 people in Web development. Increasingly, the training is aimed at more specific topics in one-day training seminars. The NCSA server technologies have been transferred to other commercial sectors, including Apache, which is now extremely prevalent within the industry.

The original group involved in the Web Consortium has changed and new members have come on board. The NCSA contract and the supercomputing centers have recently been reorganized. There are now two supercomputer centers (San Diego and Illinois) rather than four. However, NCSA's Partnership for Advanced Computational Infrastructure brings in not only NCSA, but the supercomputing centers of 80 private sector partners.

Mr. Molholm mentioned that many of the technologies being discussed by the Web Consortium cannot be implemented immediately, but it is important to look to the out years, to be informed, and to be prepared. On December 11, there will be an Open House at NSF that will highlight the activities of the Consortium.

Despite these efforts, there is still a disconnect between the information systems and technologies in the research community and the application of these technologies among government information systems. There are unique federal problems to be addressed. The aim is to link the research agenda to the real needs of the agencies; i.e., to bridge between Access America -- how the government should conduct business, and the Research Agenda Report (Towards the Digital Government of the 21st Century) [http://www.isi.edu/nsf/prop.html] -- what research projects the government should fund.

Realizing this issue, the Applications Council, NSF, the GITS Board, and the NIH National Center for Research Resources sponsored a workshop in May 1997 attended by 80 participants from federal agencies, universities, and industrial research institutions. The outcome was the report Towards the Digital Government of the 21st Century, from the workshop organizing committee chaired by Dr. Herbert Schorr (University of Southern California) and Dr. Salvatore Stolfo (Columbia University). Key recommendations included the need to coordinate multi-agency R&D efforts, to bridge the culture gap between researchers and federal information systems, and to establish a program of applied Digital Government research. The latter includes pilot projects and testbeds, technology transfer, training and education activities, and human exchange between the private and government sectors.

The Digital Government Vision is to move from a vertically integrated legacy system through an interoperable Internet to an integrated enterprise system that provides information to the citizens of the 21st Century. This vision requires a substantial investment in the development and application of "snap-in middleware services" that can support legacy systems. There are needs for specific Web services such as indexing, visualization, and electronic commerce. The Applications Council is interested in involving the private sector in the process. Currently, 60 percent of the funding is from non-NSF sectors. The Fortune 50 companies are extremely interested and involved in the development of these technologies. However, the Digital Government initiatives must deal with federal problems. For the federal government, a key issue is the degree to which vendors can wrap legacy systems with an object interface that makes legacy data available but does not necessarily require upgrade of the legacy system.

The Digital Government's program goals are to:

In addition to the cross-cutting domains already supported by the Applications Council working group, there are other domains that cut across federal agencies that can be aided by new applications. These include distributed GIS systems, public health, electronic grants administration, the regulatory process, and the integration of federal, state and local information. Additional workshops can be held on these and other topics based on interest and support from the agencies.

Mr. Brandt believes that the Digital Government Program can benefit agencies by creating an environment and process for collaboration and problem sharing and by helping to forecast commercial technologies that are 3-5 years into the future. There is also the opportunity to leverage resources. The program can provide agencies with talent from other sectors; the Computing Research Association will fund tenured faculty at agencies for one year.

The Digital Government Program has been provided with "seed" funding for the next three years from within the Computer and Information Sciences area and by others within NSF at a rate of $1M per year. In an effort to continue ties with the other information technology initiatives mentioned earlier, a GITS Board interagency Digital Government working group has been established. (Marty Wagner of GSA/GITSB has been nominated as the Applications Co-Chair.) A letter related to the Digital Government initiative will be coming to the agencies from OMB Director Raines. An agreement is being established to collaborate with the Federal Web Consortium.

The Digital Government Initiative is planning a series of workshops by the National Academy of Sciences on Digital Government and various application domains between now and October 1998. The workshops are to help develop the appropriate research agenda. Mr. Brandt will let CENDI know about the workshops as they are scheduled. Additional workshops can be held based on multi-agency proposals, particularly for the development of middleware and evaluation elements. In all cases, one-half to two-thirds of the funding must come from the participating agencies, and the remainder will come from NSF. The aim is to find program funding of about $15M by FY 2000.

STINET: Public and Private Faces
Kurt Molholm, DTIC Administrator, and Huddy Haller, Chief of STINET Management Division

Mr. Molholm introduced the STINET system [http://www.dtic.mil/dtic/stinet1.html], which is available in both public and secure versions. STINET hosts a variety of services including journal Tables of Contents from the British Library Document Supply Centre and the Canada Institute for Scientific and Technical Information (CISTI), the AFCEA Corporate Sponsor Capabilities Directory, the Department of Defense Index of Specifications and Standards, and selected full text technical report documents. The public version includes 12 years of the Technical Reports database. Many of the features are the same, but the secure STINET has the added complexity of controlling distribution limitations and access. Mr. Molholm introduced Huddy Haller, who described some of the advanced features of STINET.

STINET uses Netscape's Catalog Server to access multiple resources in the DTIC domain. Full text documents are available based only on certain criteria. Eventually, hyperlinking will be included between documents to allow access to other collections. The link can be at the chapter level or from a citation within the DTIC database to the full text of the report. Unclassified/unlimited full text information is being piloted, but limited information will also be included. Eventually, DoD work unit information summaries of ongoing research will also be available.

A new service is a personalized current awareness service called TRAIL (Technical Reports Awareness Internet Link). The user selects a particular subject pre-defined by DTIC and the resulting citations are e-mailed to the requester every two weeks. This feature is available on both the public and the secure STINET.

In addition to the pre-defined searches, registered (secure STINET) DTIC users can set up their own personal profiles for both technical reports and work unit information summaries. The user can also customize his own Web page from which to control his profiles. In addition, the citations are provided via e-mail.

STINET supports the Military Education Coordination Conference's (MECC) Partnership for Peace Information Management System (PIMS). PIMS contains information relevant to member countries, allies, and other non-NATO partners. You must have a password for this system. The libraries of the professional military schools are starting to play an important role in the MECC, because of the Z39.50 access to library catalogs. This allows searching and interlibrary loan between the schools. The student can access from home or from school.

Ms. Haller also discussed the DTIC migration to FULCRUM, which has a Z39.50 server. FULCRUM bought the commercial WAIS software, which makes the migration of the WAIS databases easier. It is anticipated that the migration of all of the WAIS databases will be completed by June 1998. The documents will be provided in PDF format. The new system will have additional capabilities, including in-line highlighting of search terms. By the end of January, DTIC will be posting unclassified, unlimited technical report citations on SIPRNET, a classified (secret-level) DoD network.

Technology Navigator
Ed Dandar, ASD/C3I, and Jerry Cogle, Jr., MITRE

The Technology Navigator (TN) [technav@dtic.mil] is a joint project between COSPO, NAIC, CMO and DTIC. The primary contractor is MITRE, with marketing and conceptual operations support from Betac Corp. Technology Navigator's aim is to provide technology-related information across the government and to build bridges, both agency to agency and government to private sector. The focus is on information technologies and seven specific sensor technologies, with links to products, documents and companies. However, the technology and infrastructure behind the TN are expandable and scalable beyond the current domains.

The conceptual design for the TN provides for information sharing across multiple user environments with different security restrictions. Currently, DTIC is hosting the Internet version available to industry, academia and other government users, and NAIC is hosting the private version for government users on the OSIS (Open Source Information System). In addition, the second version of TN is being installed on INTELINK-S (the secure, private network for the intelligence community). The information available on the DTIC and OSIS versions is also available on the INTELINK-S version, with the addition of secure information that cannot be seen on the other two systems.

The TN was recently beta tested. During this time it received 14,000 hits. Forty percent of the hits were from commercial organizations, followed by DoD and foreign government addresses. The university use was low because the beta test was conducted over the summer months.

Another part of the project is the automatic collection of relevant information. This is done via customized searches. Seventy to eighty percent of the sites are collected automatically from the Web. An enhanced Automatic Web Collection facility is being explored, because the manual submissions process is, in practice, "very shaky." The originators lack the incentives and time to perform manual collection and submission. A software agent is being explored to search the Internet and assign retrieved URLs to the defined technology topics. The first cut is to define search profiles for each area. Instead of relying on a single search engine, multiple engines with different relevancy ranking algorithms are used, and duplicates are deleted in order to get better results. The software parses the URLs and gives the robot the locations to visit, where it extracts the metadata from the HTML record of the source.
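The multi-engine collection step described above can be illustrated with a minimal sketch. It is not the MITRE software: the engine names, the sample URLs and the helper functions normalize_url and merge_results are hypothetical, and the normalization rules are only one plausible way to decide that two URLs are duplicates.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Reduce a URL to a canonical form so trivially different spellings compare equal."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = parts.netloc.lower()
    # Treat "/sensors/" and "/sensors" as the same location (an assumption).
    path = parts.path.rstrip("/") or "/"
    # Drop the fragment; it does not change which page is fetched.
    return urlunsplit((scheme, host, path, parts.query, ""))


def merge_results(results_by_engine):
    """Merge per-engine result lists into one duplicate-free mapping.

    results_by_engine maps a (hypothetical) engine name to a list of URLs.
    The return value maps each normalized URL to the engines that found it.
    """
    merged = {}
    for engine, urls in results_by_engine.items():
        for url in urls:
            key = normalize_url(url)
            found_by = merged.setdefault(key, [])
            if engine not in found_by:
                found_by.append(engine)
    return merged


# Hypothetical results for one search profile, run against two engines.
results = {
    "engine_a": ["http://Example.org/sensors/", "http://example.org/radar"],
    "engine_b": ["http://example.org/sensors#top", "http://example.org/lidar"],
}
merged = merge_results(results)
for url, engines in merged.items():
    print(url, engines)
```

Recording which engines returned each URL is a design choice of this sketch; a URL found by several engines with different ranking algorithms might reasonably be treated as a stronger candidate.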

The automatic categorization is performed by taking the same profiles used in the Automatic Web Collection but running them with the Verity search engine. The "hits" that result from each search are tagged with the corresponding subject category. This mapping to the TN concepts allows a document to be assigned to more than one category, based on the originating strategy. However, MITRE has found that precision degrades with automatic categorization. The developers are working to make agents more precise in the way they search and to extract better metadata. Seventy percent of the documents make it through to the indexing process. Filtering and metadata extraction drop some irrelevant items. They have done limited manual clean-up after the automatic process.
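The idea that one document can fall under several categories, depending on which profiles it matches, can be sketched as follows. This is a simple keyword match standing in for the Verity search engine, whose query language is not described in these minutes; the category names, profile terms and the categorize function are hypothetical.

```python
import re


def categorize(text, profiles):
    """Return every category whose profile has at least one term in the text.

    profiles maps a category name to a set of lower-case search terms.
    """
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return sorted(category for category, terms in profiles.items() if terms & words)


# Hypothetical search profiles, one per technology topic.
profiles = {
    "infrared sensors": {"infrared", "thermal"},
    "information technology": {"software", "network"},
    "acoustic sensors": {"sonar", "acoustic"},
}

categories = categorize("Infrared imaging software for field sensors", profiles)
print(categories)
```

A loose any-term match like this one also shows why precision can degrade: a document that merely mentions a profile term is categorized as if it were about that topic.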

The information repository for the TN is managed by Netscape Catalog Server. In some cases, the data is collected, reformatted and reloaded. In other cases, the data is collected and moved up to higher security levels. In many cases, only a pointer to the original is necessary. The TN is metadata based. The metadata record is extracted from the original source if the user does not provide metadata. The metadata is searched during the retrieval process.
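The fallback described above, in which metadata is taken from the source document when the submitter supplies none, might look like the following sketch. It uses Python's standard html.parser module rather than the Netscape Catalog Server robot, and the choice of fields (the page title and any named meta tags) is an assumption.

```python
from html.parser import HTMLParser


class MetadataExtractor(HTMLParser):
    """Collect the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            self.metadata[attrs["name"].lower()] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.metadata["title"] = self.metadata.get("title", "") + data.strip()


def extract_metadata(html, supplied=None):
    """Use submitter-supplied metadata if present; otherwise parse the HTML."""
    if supplied:
        return dict(supplied)
    parser = MetadataExtractor()
    parser.feed(html)
    parser.close()
    return parser.metadata


# Hypothetical source page.
page = (
    "<html><head><title>Radar Survey</title>"
    '<meta name="Keywords" content="radar, sensors"></head>'
    "<body>Report text</body></html>"
)
record = extract_metadata(page)
print(record)
```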

The TN framework is designed to allow a consistent interface to all sites. The frame-based approach to the interface was added recently. This provides a consistent presentation of the tool bar, even when a long text document is retrieved. The developers are also working with categories and taxonomies to classify the documents and sites. In the new version, MITRE has integrated the STINET queries within the framework of the Catalog Server so that the user never has to leave the TN interface. Mapping is required to do this with other databases. FIDUL has done an independent evaluation to improve the TN interface, which is constantly being changed and improved. The new version of TN has a revamped tool bar.

Emphasis is placed on push technologies to get information to the desktop based on profiles. The new version of TN includes a profile editor that uses the search interface to create personal profiles and an option to save the profile for reuse. Documents are manually submitted during the day and imported into the system in the middle of the night. The profiles are then checked against these new documents. E-mails list the titles of the documents and contain hot links to them.
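The nightly profile check can be sketched roughly as below. The sketch only builds the message bodies and sends nothing; the addresses, documents, and the matches and build_notifications functions are hypothetical, and matching on title words alone is a simplification of whatever the TN search interface actually evaluates.

```python
import re


def matches(profile_terms, title):
    """True if any profile term appears as a word in the document title."""
    words = set(re.findall(r"[a-z0-9]+", title.lower()))
    return any(term in words for term in profile_terms)


def build_notifications(profiles, new_documents):
    """Build one message body per user whose saved profile matches new documents.

    profiles maps an e-mail address to a set of lower-case terms;
    new_documents is a list of (title, url) pairs imported overnight.
    """
    notifications = {}
    for address, terms in profiles.items():
        hits = [(title, url) for title, url in new_documents if matches(terms, title)]
        if hits:
            # Each entry lists the title followed by a link to the document.
            notifications[address] = "\n".join(f"{title}\n  {url}" for title, url in hits)
    return notifications


# Hypothetical saved profiles and one night's imported documents.
profiles = {
    "analyst@example.org": {"radar"},
    "other@example.org": {"sonar"},
}
new_documents = [
    ("Radar Signal Processing", "http://example.org/doc1"),
    ("Infrared Detectors", "http://example.org/doc2"),
]
notifications = build_notifications(profiles, new_documents)
for address, body in notifications.items():
    print(address)
    print(body)
```

Users with no matching documents receive no message in this sketch, which keeps the push service from sending empty notices.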

TN can be used to serve other Intranets. It has received various endorsements from within the DoD. The Joint Intelligence Virtual Architecture (JIVA) at DIA has selected TN as a tool. TIPSTER has added products and projects in the secure environment. There are currently 200 studies that are on the Internet and 800 more on TN for the Biological and Chemical Warfare IAC. James Madison and Georgia Tech are adding information. Information from CENDI agencies could be included under a "For Official Use Only" option.