Published on November 4, 2008
The Stage Is Set for the Future
CENDI, September 2007
Walter Warnick, Director, Office of Scientific and Technical Information, U.S. Department of Energy
OSTI Mission
To advance science and sustain technological creativity by making R&D findings available to Department of Energy researchers and the American public
Information fuels discovery
Superior access to quality information speeds discovery
Advancing Science Discovery: From the ’40s to the Future
60-year anniversary, Sept. 18
Whether by print or by pixel, OSTI has long been committed to ensuring appropriate access to research results
From 1947 to 2007 – from Nuclear Science Abstracts to WorldWideScience.org – mission accomplished!
OSTI’s creation 60 years ago signified a sea change from the Secret City of the Manhattan Project toward an openness to share S&T knowledge with the public. Of course, we continue to this day to keep secret all the information that has military applications. Whether peaceful or national security related, the S&T legacy of this agency is captured in this building.
Science Progresses as Knowledge Is Shared
Science can be advanced by hiring more researchers and giving them better equipment; and science can be advanced by accelerating the sharing of knowledge
OSTI corollary: If the sharing of knowledge – or knowledge diffusion – is accelerated, scientific progress is accelerated
We consciously seek to exploit new technology to accelerate the spread of scientific and technical knowledge.
Larry Page, speaking to scientists, AAAS 2007
He called on the scientists to make more of their research available digitally.
“We have to unlock the wealth of scientific knowledge and get it to everyone.”
“Virtually all economic growth (in the world) was due to technological progress. I think as a society we’re not really paying attention to that.”
The stage is set for the future We are ready to scale up our efforts in metasearch, or federated search. Simply put, we intend to make science searchable via one portal.
Google: v., to search for information through Google
Googleable: adj., information found by Googling
Non-Googleable: adj., information that cannot be found by Googling
We must ensure access to science information that is Non-Googleable
True or False?
Most useful information is available via familiar search engines such as Google and Yahoo!
The vast majority of science information in databases is not crawled by popular search engines
Scientific databases stump Google
Google “crawls” the surface Web, but scientific databases are largely found in the deep Web
Systems that crawl the Web do not typically reach below the surface
Figure label: Surface Web
Google moves ahead with plan to open up federal Web sites
Google is making strides on an initiative to make information stored on public government Web sites more accessible to people looking for it, but challenges remain, officials with the search engine company said Wednesday. Three federal organizations recently agreed to structure their sites to make them accessible for nearly all Internet searches, the officials said. Information on the Plain Language Web site aimed at eliminating jargon in government communications, and on sites by the Energy Department's Office of Scientific and Technical Information and the Education Department's National Center for Education Statistics, has been opened up to the three most popular search engines: Google, Yahoo and MSN.
Google works to solve the problem, but there’s a better way …
We need systems that probe the deep Web
Federated search drills down to the deep Web where scientific databases reside.
Unlike the Google solution, federated search places no burden on the database owners.
Figure labels: Deep Web databases; Surface Web
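As a rough illustration of the federated search idea, the sketch below sends one query to several sources in parallel and merges whatever each source returns. Everything in it is hypothetical: the `SOURCES` records, the source names, and the `search_source` and `federated_search` functions are stand-ins invented for this example, not OSTI's actual system. Each source answers with its own search, so nothing needs to be restructured on the database side.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for deep Web databases. A real portal would call
# each database's own search interface over the network instead.
SOURCES = {
    "source_a": [
        {"title": "Fusion plasma confinement", "year": 2006},
        {"title": "Solar cell efficiency", "year": 2004},
    ],
    "source_b": [
        {"title": "Cold fusion claims revisited", "year": 2001},
        {"title": "Fusion plasma confinement", "year": 2006},
    ],
}


def search_source(name, query):
    """Run the query against one source and tag each hit with its origin."""
    q = query.lower()
    return [dict(rec, source=name) for rec in SOURCES[name] if q in rec["title"].lower()]


def federated_search(query):
    """Query every source in parallel, then merge and deduplicate the hits."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(search_source, name, query) for name in sorted(SOURCES)]
        merged = [rec for future in futures for rec in future.result()]

    # Newest first; drop records whose title was already returned by another source.
    seen, unique = set(), []
    for rec in sorted(merged, key=lambda r: (-r["year"], r["title"])):
        if rec["title"] not in seen:
            seen.add(rec["title"])
            unique.append(rec)
    return unique


if __name__ == "__main__":
    for hit in federated_search("fusion"):
        print(hit["year"], hit["title"], "from", hit["source"])
```

Because the query is run at search time, the results reflect whatever each source holds at that moment, which is the sense in which federated results are current.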
Federated search yields one-stop portals 200 million pages 50 million pages Key DOE databases 19 sources, 17 countries, all inhabited continents
Harvesting
Analogous to Google – crawls and mines data that does not reside in databases, but …
Different from Google – directed, selective crawling
Harvesting and federated search are useful when full bibliographical control is not feasible
Examples
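One way to picture directed, selective crawling is a crawler that follows links only within an approved set of hosts. The sketch below is a minimal illustration under that assumption; the `LINKS` graph, the `ALLOWED_HOSTS` set, the example URLs, and the `directed_crawl` function are all hypothetical, and a real harvester would fetch and parse pages rather than read a dictionary.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical link graph standing in for fetched pages and their outgoing links.
LINKS = {
    "https://lab.example.gov/": [
        "https://lab.example.gov/reports",
        "https://news.example.com/",
    ],
    "https://lab.example.gov/reports": [
        "https://lab.example.gov/reports/1",
        "https://lab.example.gov/",
    ],
    "https://lab.example.gov/reports/1": [],
    "https://news.example.com/": ["https://news.example.com/story"],
}

# Only these hosts are harvested (assumed selection rule for this sketch).
ALLOWED_HOSTS = {"lab.example.gov"}


def directed_crawl(seed):
    """Breadth-first crawl from the seed, staying on the allowed hosts."""
    seen = {seed}
    queue = deque([seed])
    harvested = []
    while queue:
        url = queue.popleft()
        harvested.append(url)
        for link in LINKS.get(url, []):
            if urlparse(link).hostname in ALLOWED_HOSTS and link not in seen:
                seen.add(link)
                queue.append(link)
    return harvested


if __name__ == "__main__":
    for url in directed_crawl("https://lab.example.gov/"):
        print(url)
```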
Federated Search: Advantages
Current, real-time results
No burden for database owner
Inexpensive to implement
No need-to-know for user
No searching door-to-door
Allows for fielded searching
Interoperability is automatically achieved
Additional Points
Federated search has limitations
Neither crawling nor federated search is a panacea
Federated searching does things crawling cannot do, and vice versa. They are complementary technologies
Federated searching has advanced rapidly and should continue to advance
Science as a noble enterprise