Effective solutions to problems of sustainability require collaboration among scientists and researchers in the physical and social sciences, but information-sharing across disciplines can be problematic. Now an Illinois team, led by Professor Praveen Kumar, will help to develop better methods for data-sharing in the new and growing field of sustainability. The work is part of a cooperative effort by four major universities, led by the School of Information at the University of Michigan and funded by the National Science Foundation (NSF) with a two-year, $2 million grant.
The grant will enable researchers at Illinois, Michigan, Indiana University and Rensselaer Polytechnic Institute to develop a system whereby sustainability scientists can manage and share their data. Sustainable Environment–Actionable Data (SEAD) will provide tools for the active curation of data and use social networking to engage researchers.
Understanding fundamental principles for sustainable societies requires access to large amounts of data on natural phenomena, human behavior, and economics, according to Margaret Hedstrom, a professor in Michigan’s School of Information and principal investigator for the project.
“To date these data have been difficult to obtain and use because disciplines across the natural and social sciences collect, describe, and store their data in different ways,” Hedstrom says. “The data could have significant value if it were possible to connect the data collectors with potential users, and if it were easy for individuals to search for, aggregate, and maintain valuable data for the long term.”
The team at Illinois will provide the sustainability science expertise on the project, said Kumar, co-PI and Lovell Professor of Civil and Environmental Engineering at Illinois. Kumar will lead the Domain Science Engagement Committee, which will help to ensure that the SEAD prototype meets the needs of the user community and has an impact on sustainability science.
“We will help define and prioritize the data needs for understanding the coupled human-nature interaction for advancing sustainability science,” Kumar said. “Specifically, we will engage researchers and data producers in the definition of requirements, use cases, and data models, in prioritizing data for curation and preservation, and in the design and evaluation of standards and tools for data management so that they can more easily share, integrate, analyze, and reuse data within their community.
“The data for physical science has very different character than that for social science, yet critical questions in sustainability lie at the intersection of these two domains. SEAD will allow us to address such questions by providing a common platform for these and other types of data.”
An important goal of the project is to demonstrate the use of a variety of technologies in the development of a system that helps researchers manage their data and motivates them to share it with others, Hedstrom said.
“SEAD will employ social networking technologies similar to those used on popular sites such as YouTube and Flickr to facilitate connections between scientists,” she said.
In the first two years of this project, the team will work closely with scientists in sustainable land use, water quality, urban planning, and agriculture in the upper Great Lakes and Mississippi River Basin.
“For years, sustainability science researchers have lacked a way to manage heterogeneous data over the long term,” Kumar said. “Moreover, the value of research that results from combining observations and measurements from multiple datasets to create a new dataset is often lost, as there is no easy way to share this data. SEAD will help us ensure these novel datasets are available for reuse. This can have far-reaching benefits, aiding scientists and policy makers in the areas of natural resource management, agriculture, energy, economic development, and related areas to make better decisions.”
Advances in data cyberinfrastructure are key to advancing sustainability research, said Beth Plale, co-PI and director of the Data to Insight Center at Indiana University.
“Our project brings the research libraries to the table as what we consider to be a key piece of the solution of long-term preservation of this important asset,” Plale said.
Director of the supercomputing center at Rensselaer Polytechnic Institute, co-PI Jim Myers is building the software infrastructure to support a network of repositories that will function on multiple levels.
“One of the aspects of SEAD I find most exciting is that we’ll be developing and delivering infrastructure that really couples active research and long-term preservation of important reference data to a degree that hasn’t been done before,” Myers said. “I believe that coupling will prove to be tremendously powerful and will ultimately have a dramatic impact on the pace of academic and industrial research and on the scope and scale of research projects that can be tackled.”
A central component of the project is its application to education, training, and outreach, said Ann Zimmerman, co-PI and research assistant professor at Michigan.
“The post-doctoral students participating in the SEAD project will be prepared to assume leadership roles in scientific data management.” Case studies developed during domain engagement activities will be used in graduate education courses at the four universities. A dedicated website and user workshops will extend the reach of the project and provide platforms for feedback and participation.
“SEAD will help address national goals to sustain and improve our environment,” said Hedstrom. “By developing methodology for investigators to collocate and easily access scientific data, we will be making real and vitally important contributions to scientists who grapple with the many environmental issues that confront us today.”
The initial NSF grant is for $2 million over two years, with an additional $1.5 million committed in year three.