CENDI ANNUAL PLANNING MEETING

Bolger Conference Center
Potomac, MD
September 4-5, 2003

Abbreviated Minutes

 

Following is a summary of the keynote speeches presented by Bonnie Lawlor, NFAIS; Jose Marie Griffiths, University of Pittsburgh; and Timothy Sprehe, Sprehe Information Management Associates.

 

NEW STRATEGIES AND OPERATING MODELS FOR GOVERNMENT STI MANAGERS: E-GOVERNMENT AND E-SCIENCE

"Battle for Mindshare: Challenges for the Information Industry"
Bonnie Lawlor, Executive Director, NFAIS (National Federation of Abstracting and Information Services)

Ms. Lawlor addressed the shifting power and the development of a new information infrastructure. She focused on the factors and the challenges for information providers. NFAIS began in the 1950s as the country was seeking to move science and technology forward. It began with 14 organizations, including 5 government agencies.

NFAIS members adopted computer technology when it was virtually unknown. Through innovative information management and search and retrieval systems, such as those at NASA and DoD, these early adopters laid the groundwork for electronic publishing. Databases, a byproduct of computerization, eventually became the products themselves and were available online as early as 1972. Since that time, the industry has been adapting to new media such as CD-ROM and the Web.

While there is pride in what the traditional information industry has accomplished, Ms. Lawlor senses fear and concern in the rapidly changing environment. As of September 3, 2003, 3.3 billion Google searches had been performed. Google has stated that it wants to organize the world’s information. NFAIS plans to interview Google about its plans for handling scholarly information. Several members of CENDI indicated that they are already working with Google to make their deep database content more accessible to the surface web.

How did we come to this point of asking where the traditional information industry is going and who will do our functions in the future? In the past 45 years, while the traditional information industry was busy becoming more efficient, there were changes underway to which it did not really pay attention. The user was outside the circle of our traditional infrastructure. We talked among ourselves and not necessarily outside. Now, we are no longer perceived as an integral middleman in the process.

It is a combination of factors that has caused the change in the environment. These factors include technology, economics, politics and policy.

It took 20 years to reach 60 million online searches; online searching required mediation because it was complex, slow, and expensive. However, this mediated searching raised the awareness of information among certain user groups. The first IBM Personal Computer (PC) was sold in 1981, and by 2001 there were 589 million units installed. Personal computers entered the home, and young people were quick to become PC users because of the appeal of game software. The share of households with PCs rose from 10 percent in 1984 to 70 percent in 2003. Twenty-two percent of households have more than one PC. Soon after the initial introduction of home computers, the Internet and the Web grew as well. In 1993, there were 130 web hosts. Now there are 171.6 million. The growing use of technology among adults is an important trend; almost 60 percent of the adult usage is for work, not just for entertainment and shopping.

A second factor is economics; in particular, the rise in journal prices. A Mellon Foundation Study found a nine percent annual growth rate in prices. As of May 3, 2003, there were 170,000 serial titles. At the same time, there is an average decrease of six percent annually in library budgets. (There is a more dramatic decrease among state universities where the budget cuts are deeper.) There are more individual journal titles but fewer subscribers among whom the costs can be shared. The library community is paying three times more money than in 1986 for eight percent fewer titles.

Political and policy factors also contribute to the current environment. Since the 1991 Feist v. Rural Telephone Service Company decision, libraries have been fighting for a broader interpretation of fair use and publishers have been arguing to make it narrow. Copyright, which used to be an esoteric term, is now in the minds of the public because of Napster. Copyright may not be well understood, and the young, vocal general public, who will be the lawyers shaping intellectual property in the future, don’t want restrictions on use.

Google was in the right place at the right time, but there are other factors besides Google. There will be many more Googles to follow since these new players are reflective of the combination of factors discussed above. It is important to follow all these factors and how they play out in the marketplace today and into the future.

Ms. Lawlor then reflected on the current behavior among students and faculty and on the comments received from consumers in the marketplace. The use of traditional resources in libraries or even in library portals has drastically declined. A study of Medical Student Behavior in 2000 showed that six percent have never used MEDLINE or any traditional service, even though these services are available on the web.

The user, not the information provider, determines the value. The movement from traditional to less traditional sources will depend on the discipline and the individual user needs and preferences. William Arms of Cornell University recently studied scholarly research on the web. He concludes that Google is no substitute for traditional products. However, services such as Inspec might be in trouble, because computer science information is openly accessible, often appearing on the web before it appears in journals. As the indexing and search engines on the web become better, the users’ perceptions will change, and no one should be comfortable. Even now, the value of these less-than-perfect services is that they bring scholarly information to the common user, creating valuable opportunities.

A variety of products and services are spurred by these factors. Users want more data and more diverse resources, including videos and interactive spreadsheets. However, they also want more relevant information. They want minimal overlap across services, with full coverage services like ISI’s Web of Science (Thomson Institute for Scientific Information). Data mining capabilities are driving requests for digitization of legacy information. Among ACS journal users, there is interest in pre-1907 information.

The factors are influencing the business models. Libraries and end users want to purchase only what they want, rather than having materials bundled together. New business models such as iTunes are held up as examples. The open access movement, spurred by economic factors, has gained worldwide acceptance. Electronic publishing has made the open access movement possible. Because of the technology, electronic dissemination is no longer an impediment and the majority of the researchers prefer to access information from their desktops. In 1991, there were 687 electronic serials, only seven of which were peer reviewed. By 2003, there were 5000 electronic journals that had no print counterpart.

What is the ideal system for end users? One that is easy to access, easy to use, available “24/7” (around the clock), offers a broad spectrum of resources, and is reliable, pleasurable, and reasonably priced. The web fulfills the above criteria according to user perceptions, but traditional abstracting and indexing (A&I) services do not.

The challenges for the traditional industry are visibility, content access and retrieval methods, and the business models. Changes among the traditional services are slow in coming. Cybercash systems are slowed because publishers are concerned about revenue streams and changing business models. Some publishers are collaborating with library web sites. For example, Elsevier just published a booklet on how to get visibility, but it is through the traditional library channels. The emphasis must be on services (both free and fee-based) rather than on content. Users want to feel that they are being helped and so some level of consultancy- or customer-based service is necessary.

Obviously, having a web presence is key, especially a presence that gets you on the results lists of the major search engines. Google is indexing the content of The New England Journal of Medicine and IEEE (Institute of Electrical and Electronics Engineers). However, the effort required to have Google index your deep web content is not trivial.

Multimedia has to be incorporated into databases. The static nature of text is not enough anymore. Open Access scholarly journals will have to be incorporated into traditional A&I services in the future. More linking to full text and related non-traditional information will be needed. New uses for legacy data, particularly in the areas of data and text mining, will be necessary. Most use models don’t really allow this right now, but corporations, particularly in the chemistry and pharmaceutical areas, would like to take advantage of these large repositories of data and information.

Data visualization, voice recognition, and adaptation to new information “appliances” are necessary. Fifteen percent of physicians in the US use handheld devices. By the end of 2005, 71 percent of web access will be through some sort of portable device.

Business models will continue to be an issue. Traditional models must be viewed against individual, customized models such as iTunes, or “baiting” with free collections such as that at the British Library. Revenue streams and pricing will need to be carefully evaluated.

The big question is “what are you doing to secure the future?” Current A&I strategies include linking, collaboration, automation, and increased amount and diversity of content. Elsevier, for example, says that its purpose is no longer to build the best engineering database but to build the best platform. Publishers are banding together to provide information across their repositories.

Many traditional A&I services still do not understand that the environment is changing. The A&I services have the skill sets to aid in the organization of this information; it is a matter of deciding to “go after it,” keeping in mind the factors that are forcing the changes.

"Panel on E-government and E-science"
E-government, Dr. Timothy Sprehe

Dr. Sprehe, previously of OMB and author of OMB Circular A-130, asked the question, “Will E-government work?” In his opinion, E-government works when agencies have already perceived and accepted the logic of E-government even before the blessing of the current Administration. Some agencies, particularly those in the STI community, were already “doing” E-government.

The speed and ease with which E-government initiatives are implemented and their ultimate success depend on several factors. The easiest initiatives are those that are non-interactive (the information is just posted) and involve only a single agency with a well-defined constituency that is not expecting much interaction. These types of activities probably result in fewer FOIA requests, and the impact on printing and records management activities can be seen. Success is less likely and the development is more difficult when multiple agencies are involved, interaction is sought, or there is a diverse and ill-defined audience. Complex, interactive E-government services are harder, because many of the services depend on the perception of the user with regard to privacy, security and other concerns.

Multi-agency initiatives are still workable when the agencies have preexisting relationships and think of themselves as a community. FedStats is a positive example of this. On the other hand, Biz.gov had a champion under the Clinton Administration in the reinventing government initiative, but when that person left, the service died because there was no preexisting community and it was difficult for the agencies to see the benefit. A similar outcome is likely for the E-rulemaking initiative, which involves ten agencies. Dr. Sprehe questioned whether such a combined system would actually increase public participation in rule making.

E-government is less likely to work when E-government initiatives do not have real money. The most recent cut in funding resulted in only $5 million for FY04. Dr. Sprehe suggested that the low budget priority indicates that E-government itself is no longer a priority. This could explain some of the resignations in OMB.

E-government initiatives often require substantial back office systems to make them work. This is especially true for those systems that are transactional and two-way. There is seldom adequate funding made available to upgrade these systems and fully support the E-government activities.

The Government Paperwork Elimination Act requires only that agencies accept electronic information submitted by the public, but it doesn’t mention automating the processes. Therefore, there are examples of electronic forms being printed out to re-key them. GPEA assumes electronic signature capability on the part of the agencies. GPEA also sets no standards or deadlines for agencies to respond to the public. It is flawed because it only takes the first steps but provides no plan for moving forward.

Will E-government survive? The change in leadership at OMB means loss of a central coordinating force. It is likely that E-government will be downplayed during this election year because it really doesn’t gain votes, so Congress isn’t likely to follow through with the money. The appropriations and oversight committees have raised concerns about the degree to which OMB had funding oversight. It is possible that Congress will come back with specific agency appropriations.

E-science, Dr. Jose Marie Griffiths

Dr. Griffiths spoke from her experience as former CIO at the University of Michigan where they supported the scientific environment and from her current investigations into the scientific enterprise, scientific publishing, and use of scientific information. She has recently been looking at the readiness of individuals and groups to take advantage of the technologies and identifying at what point the technologies are really taken into the usage routine of individuals.

E-science suffers from the same issues as E-government. However, the science community is different in some respects. There are significant changes in science, including more scientists, especially in the commercial sector. Science is “bigger”, more collaborative, global, more mission driven, more competitive for resources and attention, and more susceptible to political influence over the decisions that are made. At Pittsburgh, where Dr. Griffiths is the Doreen E. Boyce Chair & Professor at the School of Information Sciences, they now talk to doctoral students about the political aspects of science as well as the ethical ones.

Changes are also underway in scientific institutions. Within R&D, there is less research and more development. The focus of the institutions is on efficiency, productivity, and total quality management. Risk aversion is predominant in many academic institutions. There is more directed funding from the outside, which limits serendipity. The increase in outsourcing influences collaborations, particularly across boundaries.

All this results in increased pressure on scientists. It is important for them to get up to speed very quickly. There is less publishing, especially in the private sector, but they take advantage of what has been published from the public sector. (The number of times a paper is cited by others is not necessarily a good metric; the actual usage is much higher, because information from a paper is shared among groups and colleagues in the same institution.) The academic sector is often publishing for tenure purposes. The primary feeling, in many cases, is “why should we give away our ownership of the intellectual property?” This raises the interest in open access and institutional repository development.

Among academics, there is great pressure to commercialize their findings or to consult. These activities cover administrative costs and bring money into the institution.

There are also changes in the process of doing science. The amount of information is vastly outweighing our ability to analyze it. Scientists don’t know where to begin analyzing these terabytes of information, particularly instrument and clinical data. The information is often disaggregated. This disaggregation indicates a critical need to capture and store data, information, and knowledge; to identify, retrieve, analyze, synthesize, visualize, explore, and mine it; and then to collaborate on and communicate it.

Unfortunately, the web has become a catchword in science, and it is being used as if it provides access to all the information that ever existed. Scientists are taking advantage of the web, but, at the same time, they are overwhelmed by the amount of information and the lack of organization. The invisible web would be extremely helpful to them, but the knowledge of the wealth of important information available there is shallow among both faculty and students. The next generation’s understanding is actually quite limited.

Technology, on the other hand, has opened up our ability to capture and analyze information. A front-end exploratory tool is needed that would help scientists approach the vast amount of data as opposed to the deep approach that we have always taken.

Collaboration is changing science. The upper-atmospheric physicists whom Dr. Griffiths and her colleagues have studied include approximately 1500 researchers. The group collaborates on the web to plan data collection activities. They create models in anticipation of what they will see, and then they turn on all their instruments. In this virtual laboratory, the physicists are able to control the instruments remotely and to see the data coming in over a period of days. They can look at the progression and see it analyzed and rendered in real time. The data is then examined against the models. The researchers chat about the results online, immediately informing their theoretical ideas and forming the basis for the next set of mutual experiments.

The researchers observed the atmospheric physicists at the University of Michigan, who serve as a hub for this activity. The members of the group are beginning to feel comfortable with the collaborative environment to the point where the experimentalists and theoreticians have become a cohesive cybercommunity. This change in the “doing” of science results in a shrinking of the theoretical cycles.

E-science changes how new people join a community. For example, because of the online nature of the cybercommunity, it is easier for new scientists to join in the discussion. It has also changed how they think about their science, and they begin to expect desktop science. This has transferred over to their expectation of how digitally published content should be conveyed to them. While people like the availability of digital library collections, they continue to use the library and traditional publications. The researchers expected that the physicists would have made the change earlier with the Los Alamos Archive, but this wasn’t the case. Instead, the researchers found that it has to do with the number of people who are comfortable working in this cyber environment. This principle will have an impact on the development and adoption of both e-science and e-government.

The push from the user community is for technology to solve the problem. They want immediate, customized results that are aggregated across resources and pushed to their desktop. One might say that this is done through portal technologies. However, most of the portal technology is very passive.

With this type of collaborative environment, there is more emphasis on communication. The content that the group accumulates, often on a web site, includes the data and the analyses. These often require version control and effectively make up the group’s documentation. The result becomes a whole trail and sequence of the analytical output, suggesting that the future of documenting science is more than just creating the base publication. Such a trend will then drive the need for tools to be deposited and documented as well. However, issues of intellectual property in software are stalling this activity.

What impact will E-science have on the management of scientific and technical information? There will be more content (both in terms of volume and format types), and an important question is what is kept for the long term. The federation of content objects into aggregates requires more and better quality metadata. There will be a move from information to knowledge management and knowledge sharing. Knowledge management activities won’t be perceived as successful if they are viewed as mere extensions of information dissemination. There needs to be a culture of knowledge sharing and knowing. If this is achieved, the need for capture and management of the information will flow from it. Currently, we are focusing on the management of artifacts, rather than on what people are trying or need to do.

There is increased pressure for accountability without real definition of what this means. Our environments are so complex and everything is quite different from how it was just a few years ago. How do you create metrics that allow you to be accountable? How do we know when technology is really making a difference? What benefits can you accrue? There will be an imperative on the part of STI organizations for leadership rather than management, and vision and understanding are needed to bring forces together to make the vision a reality.

In this environment, it is important to understand the capabilities and limitations that are brought about by technologies; organizational, sector, and discipline structures; economic and legal frameworks; individual, group, and organizational behaviors; and by the Internet culture in general. It is a complex environment that is not easily understood or changed. We need to understand the value propositions of e-science. Unfortunately, it is often the case that analytical efforts, consulting, and reengineering of processes are more an exercise in justifying what exists than in truly reengineering.

Key issues for STI programs include ensuring enduring access to the information; developing a culture of knowing; dealing with intellectual property rights and responsibilities, and making sure that this is visible to users of the content; rethinking the common good; balancing openness with security; and educating scientists, managers, support personnel, and funders in new ways of doing things. Abstracting and indexing has a legitimate role to play in providing a trusted aggregation of information.

Standards development and adoption should be accelerated. However, standardization brings vulnerability. People often develop standards with self-interest in the outcome. It is important to get involved in standards development, though Dr. Griffiths noted that involvement in the NISO standards activities has eroded.

We need to improve the skills of scientists. Most NSF-funded work never really gets out for general use, because people aren’t ready, they don’t have the skills, and it is too uncomfortable for them to use it.
