Collaboratory@Columbia to Embed Data Literacy Throughout Curriculum

The seminar in Fayerweather Hall required a new four-letter code in the directory of classes, HSAM, because it combines two disciplines that are rarely taught in tandem: history and applied mathematics. On a recent Tuesday, the discussion was about interpreting and analyzing data in polls, surveys and social media.

February 17, 2017

“Think about who’s generating the survey, who’s funding it and what they want,” Chris Wiggins, associate professor of applied mathematics in Columbia Engineering School, told the students. “It’s difficult to do statistical analysis that’s not poisoned by one’s own beliefs.” He co-teaches the class, “Data: Past, Present and Future,” with Matthew L. Jones, the James R. Barker Professor of Contemporary Civilization in the history department.

The course is among the first from the Collaboratory@Columbia, an initiative jointly founded last year by Columbia’s Data Science Institute and Columbia Entrepreneurship to embed data literacy in Columbia’s curriculum by pairing data scientists with professors in other fields to develop and teach new courses. Wiggins (CC’93), chief data scientist for The New York Times, is affiliated with the Data Science Institute. Jones, whose specialty is the history of science and technology, studied data science, computer law and privacy law with a 2012-2015 fellowship from the Andrew W. Mellon Foundation.

“Our class aims to combine an introduction to statistical and computational reasoning with reflection on the ethical and political ramifications,” said Jones, who is working on a book about data collection and analysis by government intelligence agencies. “Data is not a natural thing; there are systems that produce it. We have to equip people to be critical about the limitations and possibilities of every data set, wherever it comes from.”

Richard Witten (CC’75), the founder of Columbia Entrepreneurship, said inspiration for the Collaboratory grew from conversations with friends and associates working in law and finance. “I asked them, on a scale of 1 to 10, how important technology is to them, and the answers were all seven, eight or nine,” recalled Witten, who is also a special advisor to University President Lee C. Bollinger and former vice chairman of the board of trustees. “Then, when I asked how proficient they are in technology, the numbers I got were all two, three or four. It is incumbent on us as a university that’s training future leaders to offer courses in technology contextualized so we don’t have this gap going forward. We can be a leader.”

The Collaboratory Fellows Fund awards grants of up to $150,000 to support the development of curricula that teach the technical and critical thinking skills students will need for careers in academia, government, business and other fields. A multidisciplinary faculty committee reviews proposals, typically submitted by two professors, and the provost makes final decisions. The Collaboratory expects to fund at least 18 projects over the next three years.

The first projects, approved last spring, include a collaboration between the Journalism School and the Graduate School of Architecture, Planning and Preservation to develop two courses: one on geographic information systems, spatial analysis and web-based mapping for student journalists, and another on storytelling techniques for architecture students.

The Mailman School of Public Health is developing a course on how to use big data. The Business School and Columbia Engineering are developing a technology curriculum that will include some existing courses. The School of International and Public Affairs and the computer science department are building on an existing course to develop a class in data science for public policy.

Many students who are not computer science majors take data science courses, but the Collaboratory is unusual in its interdisciplinary approach, said Patricia J. Culligan, the Robert A.W. and Christine S. Carleton Professor of Civil Engineering and associate director of the Data Science Institute. “There is interest and demand among students who don’t want to be data scientists but want to understand how the data revolution is transforming their fields,” she said.

Other Collaboratory initiatives include a boot camp held in January for Ph.D. students and post-docs interested in data science; more than 100 people applied for 30 places. A clinic in Butler Library to help faculty and students with data issues is planned for the spring.

“We asked schools and departments what training they thought their students needed—students in history, in public health, in policy,” Culligan said. “What’s going to come out of the Collaboratory is a really unique set of educational offerings that eventually will be taken up by other universities.”

The seminar that Wiggins and Jones teach has 19 students, more than half of them majors or double majors in computer science. The professors hope to attract a larger and more varied group in the fall, and Jones hopes the course will eventually fulfill part of Columbia College’s science requirement.

“We’re attempting to do this very challenging thing in science education, which is making a course appealing and interesting both to students who are more technically and technologically advanced and those who are less technologically advanced,” Jones said. “It’s about things that all students, all citizens, should know about data,” Wiggins added.