Data GIS Blog
How do you support 57,860 online students learning R and statistics? Late last fall, Data and GIS Services shared this challenge with Professor Mine Çetinkaya-Rundel and the staff of CIT as we sought to translate Professor Çetinkaya-Rundel’s successful Statistics 101 course into a Coursera class on Data Analysis and Statistical Inference. For several years, Data and GIS Services has helped Statistics 101 students identify appropriate data and use the R statistical language for their assignments. The scale of the Coursera course introduced a new challenge: providing engaging data to a very large audience without the opportunity to support everyone in the class directly.
In our initial meetings with Professor Çetinkaya-Rundel, she requested that Data and GIS create data collections for the course that would be easy to access in R and would include a range of statistical measures that would appeal to the diverse audience in the class. The first challenge, easy access in R, required some translation work. While R excels in flexibility, graphics, and statistical power, it lacks some of the built-in data documentation features present in other statistical packages. This project prompted Data and GIS to reconsider how to provide documentation and pre-formatted R data to an audience that would likely be unfamiliar with both R and data documentation.
The second challenge, finding data that covered a wide range of interesting topics, proved much easier. The General Social Survey, with its diverse and engaging questions, was an easy choice for the class. The American National Election Studies also offered a diverse set of measures of public opinion that suited the course well. With these challenges identified and addressed, we spent the end of 2013 selecting portions of the data for the class (subsetting), abridging the data documentation for instructional use, and transforming the data for use in an online setting (processing missing values for R, creating factor variables).
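For readers curious about what those preparation steps look like in R, here is a minimal sketch. It is not the code used for the course: the data frame, the variable names, the "no answer" codes (99 and 9), and the category labels are all made up for illustration, and the real survey files use their own coding schemes.

```r
# Hypothetical raw extract. Names and numeric codes are invented for this
# example and are not taken from the actual survey files.
raw <- data.frame(
  year   = c(2010, 2010, 2012, 2012, 2012),
  age    = c(34, 99, 51, 27, 68),   # assumption: 99 means "no answer"
  degree = c(1, 3, 0, 9, 4),        # assumption: 9 means "no answer"
  income = c(5, 12, 8, 3, 10)       # a column the class does not need
)

# 1. Subsetting: keep only the variables selected for the course.
survey_small <- raw[, c("year", "age", "degree")]

# 2. Missing values: turn the numeric "no answer" codes into NA so that
#    R functions treat them as missing rather than as real values.
survey_small$age[survey_small$age == 99] <- NA
survey_small$degree[survey_small$degree == 9] <- NA

# 3. Factor variables: attach readable labels to the category codes.
survey_small$degree <- factor(
  survey_small$degree,
  levels = 0:4,
  labels = c("Less than high school", "High school",
             "Junior college", "Bachelor", "Graduate")
)

# Quick check of the result.
summary(survey_small)

# Save in R's native format so students can open it with load().
out_file <- file.path(tempdir(), "survey_small.RData")
save(survey_small, file = out_file)
```

A student would then only need `load()` on the saved file to start working, with missing values and category labels already in place.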
As Professor Çetinkaya-Rundel’s class launches on February 17th, this project has given us a new appreciation of what it takes to provide data and statistical services in a MOOC, and it has produced course materials that we are also using in Statistics 101 at Duke. While students begin the Coursera course on Data Analysis and Statistical Inference, students in Professor Kari Lock Morgan’s Statistics 101 class will use these data in their on-campus Duke course as well. We hope that both collections will reduce some of the technological hurdles that often confront courses using R and improve statistical literacy at Duke and beyond.
Confused about Data & GIS Services? Not sure what questions you should be asking us or what kind of services we provide? Here’s one handy chart we’ve come up with to explain what exactly we cover in our consultations and workshops.
When it comes to picking what day to stop by our walk-in hours or knowing how much of the data life cycle our consultants cover, this graphic might be your first stop. Whether it’s finding data, processing or analyzing that data, or mapping and visualizing that data, we have staff with expertise to help!
Still not sure who to approach or what kind of help you might need? Just email firstname.lastname@example.org to get in touch with all of us at once. Some questions can be answered quickly over email, but we’re also happy to schedule an appointment to talk in person.
Explore network analysis, text mining, online mapping, data visualization, and statistics in our spring 2014 workshop series. Our workshops provide a chance to explore new tools or refresh your memory on effective strategies for managing digital research. Interested in keeping up to date with workshops and events in Data and GIS? Subscribe to the dgs-announce listserv or follow us on Twitter (@duke_data).
Currently Scheduled Workshops
Thu, Jan 9 2:00 PM – 3:30 PM Data Management Plans – Grants, Strategies, and Considerations
Mon, Jan 13 2:00 PM – 3:30 PM Webinar: Social Science Data Management and Curation
Mon, Jan 13 3:00 PM – 4:00 PM Google Fusion Tables
Tue, Jan 14 3:00 PM – 4:00 PM Open (aka Google) Refine
Wed, Jan 15 1:00 PM – 3:00 PM Stata for Research
Thu, Jan 16 3:00 PM – 5:00 PM Analysis with R
Tue, Jan 21 1:00 PM – 3:00 PM Introduction to ArcGIS
Wed, Jan 22 1:00 PM – 3:00 PM ArcGIS Online
Wed, Jan 22 3:00 PM – 4:00 PM Open (aka Google) Refine
Mon, Jan 27 2:00 PM – 3:30 PM Introduction to Text Analysis
Wed, Jan 29 1:00 PM – 3:00 PM Analysis with R
Thu, Jan 30 2:00 PM – 4:00 PM Stata for Research
Mon, Feb 3 1:00 PM – 2:00 PM Data Visualization on the Web
Mon, Feb 3 2:00 PM – 3:00 PM Data Visualization on the Web (Advanced)
Tue, Feb 11 2:00 PM – 4:00 PM Using Gephi for Network Analysis and Visualization
Wed, Feb 12 1:00 PM – 3:00 PM Introduction to ArcGIS
Tue, Feb 18 2:00 PM – 3:30 PM Introduction to Tableau Public 8
Tue, Feb 25 1:00 PM – 3:00 PM ArcGIS Online
Thu, Feb 27 1:00 PM – 3:00 PM Historical GIS
Mon, Mar 3 2:00 PM – 3:30 PM Designing Academic Figures and Posters
Tue, Mar 4 1:00 PM – 3:00 PM Useful R Packages: Extensions for Data Analysis, Management, and Visualization