Data to Manuscript in R

Welcome to the first quarter of Data to Manuscript in R!

D2M-R is designed to equip you with the tools necessary to conduct reproducible data analysis using R. D2M-R I in Winter 2026 will cover foundational R programming skills, data wrangling using the tidyverse, and basic data reporting techniques.

The course structure is unusual, so please review all course policies and assessment details from the syllabus below:

Winter 2026 Details

Class Meetings

  • When: Tuesdays & Thursdays, 11am-12:20pm
  • Where: 1155 E 60th St, Room 289A

Professor

  • Dr. Natalie Dowling
  • Email: ndowling@uchicago.edu
  • GitHub: @nrdowling
  • Office Location: 1155 Building, Room 181
  • Office Hours: Thursday 2pm - 4pm

Teaching Assistant

  • Qilong Bi
  • Email: bql20@uchicago.edu
  • GitHub: @QilongBi
  • Office Hours Location: 1155 Building, Room 222
  • Office Hours: Tuesday 2-3:30pm

Hubs

View course policies, assessment, and other details in the syllabus.

Materials

Slides

Slides will typically be posted on Mondays and remain accessible through the end of the academic year. Links in the course calendar below will take you directly to the web presentation version of the slides.

You can alternatively find the Quarto files in the website’s repo, which include additional details and notes not visible in the presentation format.

Readings

D2M-R textbook chapters refer to the textbook that accompanies this class. The book is very much in development, with large portions entirely untouched as of the start of the quarter, so prioritize the assigned resources from external sources like R for Data Science.

You can contribute to any part of the textbook for course credit! See the student hub documents for textbook contributions for details.

Winter 2026 Calendar

Unit 1: Establishing a Workflow

Week 1 | Introduction to D2M-R

Jan-6; Jan-8

Class plans:

  • Tuesday: Introduction to the course, syllabus review
  • Thursday: Debugging & problem-solving strategies
  • Example Assignment - Tentatively, the plan here is:
    • Tuesday end of class: connect RStudio and GitHub, accept the assignment, and clone the repo
    • Before Thursday: figure out anything you didn’t get working on Tuesday, up to having the repo locally
    • Thursday end of class: complete the assignment, fill out the submission doc, push to GH, and create a pull request
    • Realistically, we may have little time to work on this in class. There are detailed instructions to get you through each step of this assignment on this site, other R learning sites, and in the files of the assignment itself. Whether or not we work on it in class, you should complete it before Tuesday Week 2.
    • See “Assignments & To-Do” below for details about the assignment itself.

Primary learning objectives:

    1. RStudio + Quarto workflow
    2. GitHub repositories and version control

Class Materials:

Downloads:

  • R
  • RStudio
  • Git
    • You don’t usually need to install Git separately, but you can

Additional Resources:

Graded Submissions:

  • Accountability Plan - Initial Submission
    • Due: Monday, January 12th, 2026
    • Submit to Canvas.
    • Create an individual accountability plan to follow throughout the quarter.
    • See the accountability page for details about this assignment.
    • We’ll talk about this on Thursday.

Recommended Exercises:

  • Example Assignment:
    • This is a crash course in using RStudio with GitHub, working in Quarto documents with markdown, and running simple R code within a Quarto document.
    • Use this GH classroom invite to access the assignment so you can figure out the process of cloning, editing, committing, pushing, and submitting assignments through GitHub and GH Classroom.
    • Between Tuesday and Thursday, you should get everything set up at least through having the repo cloned on your computer and accessible with an RStudio project.
    • Before Week 2 you should complete the rest of the assignment, using the problem-solving slides to help if you get stuck.
  • Guided Exercise: Create and Sync a GitHub Repo with RStudio
    • This exercise walks you through creating a GitHub repository and connecting it to an RStudio project.
    • This is similar to what you’ll do for the example assignment, but with more detailed instructions and without the logistical complications of GH Classroom.

Other To-do:

    • Minimally, send a message including your name and GitHub username, but we’d also love to know a little bit about you and/or why you’re taking the class!

Reminders:

  • Initial accountability plans are due on Monday, January 12th. Look at the examples if you need inspiration. Remember that ultimately these plans are for you, not me, so they should be in a form that you’ll actually use. Make them fancy, make them minimal. Copy and paste directly from the examples, write a few sentences for each element, or make a full database with day by day planning. Don’t force yourself into a format that will be more overwhelming than functional.

Week 2 | Git & GitHub

Jan-13; Jan-15

Class plans:

  • Tuesday: Introduction to Git & GitHub (mostly slides)
    • Now that you’ve got it set up, we’ll actually learn what it is.
  • Thursday: Finish slides if needed + hands-on practice with Git & GitHub
    • Create a repository, make changes, commit, push, pull, etc.
    • Q&A/demo as needed for RStudio and GitHub
  • Depending on need/interest, we can use Tuesday or Thursday to do a more involved RStudio and Quarto introduction.
    • That would make it a very dense week for slides trying to fit all of Git/GitHub into 1 day, but it’s doable.

Primary learning objectives:

    1. RStudio + Quarto workflow
    2. GitHub repositories and version control

Class Materials:

Additional Resources:

Next Steps: After this week you’re ready to dive into some repos.

  1. Make a repo and linked RStudio project for your integrated data project.
    • This is a private repo in your private account. Do not create the repo within the D2M-R organization. Add prof+TA as collaborators before submitting your first draft.
    • All you need at this point to get started is a name. Hopefully you have a plan for what data you’ll be using, which should be enough to form a sensible name. Though it’s generally not recommended if you can avoid it, you can change the name later (via GitHub settings) if needed, so there’s really no reason to wait on this.
    • Start by adding the essential files:
      • README.md and .gitignore can be created when you make the repo on GitHub or added later via RStudio
      • Create a manuscript Quarto file (e.g., manuscript.qmd) in the root directory.
      • Add a /data folder with any data files you have ready to go – be sure they are anonymized and safe to upload to a private repo first!
    • Add some content (real or placeholder) to your readme and manuscript files so you have something to commit and push. Experiment with simple markdown formatting in both files (e.g., headings, bold/italic text, lists, links, etc.) and add a code chunk to the manuscript file.
    • Remember the cardinal rules of Git: Pull when you sit down. Commit more than you think you should. Push when you stand up.
  2. Explore the Skeleton Repo core skills project. You can complete this right now to get points for objectives 1 & 2, or start it now and build it up over the next couple weeks to add points toward objectives 3-5 (or more).
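As a rough illustration of the essential files listed in step 1, here is a minimal base R sketch that builds the same skeleton. The project name `my-data-project`, the placeholder text, and the `.gitignore` entries are assumptions for illustration only, and the sketch works in a temporary directory so it cannot touch a real project. In practice you would create these files through GitHub and RStudio.

```r
# Hypothetical project name; tempdir() keeps this sketch away from real files.
project_dir <- file.path(tempdir(), "my-data-project")
dir.create(file.path(project_dir, "data"), recursive = TRUE, showWarnings = FALSE)

# Placeholder README with a little markdown formatting.
writeLines(
  c("# My Data Project", "", "A *placeholder* description with a **bold** word."),
  file.path(project_dir, "README.md")
)

# Typical RStudio entries for .gitignore (adjust for your own project).
writeLines(
  c(".Rproj.user", ".Rhistory", ".RData"),
  file.path(project_dir, ".gitignore")
)

# Minimal manuscript: YAML header, a heading, and one R code chunk.
fence <- strrep("`", 3)  # three backticks, built here to keep this example readable
writeLines(
  c(
    "---",
    "title: \"Manuscript\"",
    "format: html",
    "---",
    "",
    "## Introduction",
    "",
    paste0(fence, "{r}"),
    "1 + 1",
    fence
  ),
  file.path(project_dir, "manuscript.qmd")
)

# List everything that was created, including hidden files.
list.files(project_dir, all.files = TRUE, recursive = TRUE, include.dirs = TRUE)
```

Once files like these exist in a real repo, you have something concrete to commit and push.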

Recommendation:

  1. Make a few human friends in class and form a study or accountability group. This will be a nice supplement to your new rubber duck friend and open up options for group projects later.

Office Hours:

  • Dr. Dowling: Thurs 2-4pm, 1155 Bld. Room 181
  • Qilong Bi: Tues 2-3:30pm, 1155 Bld. Room 222
    • Open/drop-in

Reminders:

  • Accountability plans past-due
  • Accept the GitHub D2M-R organization invite
    • Requires accepting any GH classroom invite first
  • RStudio + Git setup: happygitwithr.com

Unit 2: R Programming Foundations

Week 3 | R Programming Language

Jan-20; Jan-22

Class plans:

  • Lecture: Fundamentals of R Programming Language
    • R syntax and structure
    • Functions and packages
    • Control flow - conditionals and loops
  • Exercises/demos:
    • Programmer’s Groceries
    • hello_world() example
    • Packages and Dependencies

Primary learning objectives:

    1. Base R syntax and data structures
    2. Control flow (if/else, loops)
    3. Defining functions in Base R
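To make these objectives concrete, here is a small base R sketch that defines functions and uses a conditional and a loop. The body of `hello_world()` and the helper `describe_number()` are assumptions written for illustration. They are not the versions used in the class exercises.

```r
# A function with a default argument.
hello_world <- function(name = "world") {
  paste0("Hello, ", name, "!")
}

# Conditional: classify a single number.
describe_number <- function(x) {
  if (x < 0) {
    "negative"
  } else if (x == 0) {
    "zero"
  } else {
    "positive"
  }
}

# Loop: apply the function to each element and store the results.
values <- c(-2, 0, 5)
labels <- character(length(values))
for (i in seq_along(values)) {
  labels[i] <- describe_number(values[i])
}

hello_world()         # "Hello, world!"
hello_world("D2M-R")  # "Hello, D2M-R!"
labels                # "negative" "zero" "positive"
```

Pre-allocating `labels` and looping over `seq_along(values)` is a common base R pattern. A vectorized alternative such as `sapply(values, describe_number)` would give the same labels.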

Class Materials:

Additional Resources:

Note: This week’s resources (both textbook and additional) include a lot of overlap of material. This is intentional. This content is probably the most essential of the whole course, so I want to offer multiple approaches and perspectives to teaching it. I’ve bolded the chapters I think are likely to be the most useful, but I suggest skimming through all of them to see which ones resonate best with you.

Next Steps:

  1. If you haven’t already, make a repo and linked RStudio project for your integrated data project.
    • This is a private repo in your private account. Do not create the repo within the D2M-R organization. Add prof+TA as collaborators before submitting your first draft.
    • Take a look at the to-do from last week for tips to get started.
  2. Explore the R Programming core projects. You can view them in the student hub or accept any as GH Classroom assignments via links on the menu.
    • adoption-day is a good one to start with if you want a guided introduction to writing functions in R.
    • hello-world is a version of the classic exercise to get you familiar with R syntax and function structure. It has clear and simple goals but less step-by-step instruction.
    • wrangling-function is more advanced, and is better suited to students who have some familiarity with base R and dplyr already. For most students, it will be better to come back to this one in Week 5 or later.
  3. Review the newly added suggestions for enrichment activities in the student hub. Most can be begun at any time, and several are best suited to continuing across at least a few weeks or the whole quarter.

Reminders:

  1. Keep track of your accountability plan! You won’t have a formal check-in with it until week 5, but it’s only useful if you actually use it between check-ins.

Week 4 | Tidyverse Essentials

Jan-27; Jan-29

Class plans:

  • Lecture: Welcome to the Tidyverse
    • Tidy data principles
    • Import & export with readr
    • Data manipulation with dplyr
    • Reshaping data with tidyr
  • Exercises/demos:
    • Read-in and -out practice with readr
    • dplyr pipeline demo
    • Combining data with tidyr

Primary learning objectives:

    1. Importing data with tidyverse tools
    2. Data manipulation with dplyr and pipelines
    3. Tidy data structure
    4. Reshaping data with tidyr
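The objectives above can be sketched in one short script. This is an illustrative example only, assuming the readr, dplyr, and tidyr packages are installed. The built-in `mtcars` dataset and the column name `wt_lbs` are choices made for this sketch, not class materials.

```r
library(readr)
library(dplyr)
library(tidyr)

# Import/export: write a built-in dataset to a temporary CSV, then read it back.
csv_path <- tempfile(fileext = ".csv")
write_csv(mtcars, csv_path)
cars <- read_csv(csv_path, show_col_types = FALSE)

# dplyr pipeline: filter rows, add a column, then summarise by group.
cyl_summary <- cars |>
  filter(hp > 100) |>
  mutate(wt_lbs = wt * 1000) |>  # wt is recorded in thousands of pounds
  group_by(cyl) |>
  summarise(
    mean_mpg = mean(mpg),
    mean_wt_lbs = mean(wt_lbs),
    .groups = "drop"
  )

# tidyr: reshape the summary from wide to long, then back to wide.
cyl_long <- cyl_summary |>
  pivot_longer(
    cols = c(mean_mpg, mean_wt_lbs),
    names_to = "measure",
    values_to = "value"
  )

cyl_wide <- cyl_long |>
  pivot_wider(names_from = measure, values_from = value)
```

In the long version each row holds one value for one group and one measure. That layout is often the more convenient one for plotting.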

Class Materials:

Additional Resources:

Note: Resources from before ~2024 use the %>% pipe, but everything is transferable to the native R pipe |>.
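For simple chains the two pipes behave the same way, as this minimal sketch shows. It assumes the magrittr package is installed, since that is where `%>%` comes from, and R 4.1 or later for `|>`. One difference to watch for in older code is the placeholder: magrittr uses `.`, while the native pipe uses `_` with a named argument (R 4.2 or later).

```r
library(magrittr)  # provides %>%

x <- c(1, 4, 9)

# magrittr pipe, common in older resources
old_style <- x %>% sqrt() %>% sum()

# native pipe, available in R 4.1 and later
new_style <- x |> sqrt() |> sum()

identical(old_style, new_style)  # TRUE
```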

Next Steps:

  1. Explore the tidyverse core projects. You can view them in the student hub or accept any as GH Classroom assignments via links on the menu. After this week, you’re ready to start any of them, but most have elements that we won’t cover until weeks 6 & 7 (working with stringr and forcats).
    • clean-level-1-mtcars: Structured exercise to introduce data manipulation in the tidyverse. Primarily uses dplyr and tidyr, but includes some manipulation of strings and factors as well
    • clean-level-2-midwest: Open-ended with a general goal but no step-by-step instructions. Similar content to level 1.
    • clean-group: Group project where each member makes a clean dataset messy, then cleans another member’s messy dataset following tidy data principles.
    • wrangle-level-1-starwars: Structured exercise to go further into data wrangling. Includes a mix of dplyr, tidyr, stringr, and forcats.
    • wrangle-level-2-gapminder: Open-ended with a general goal but no step-by-step instructions. Similar content to level 1.
    • wrangle-group: (Similar to the group cleaning) Group project where each member un-wrangles a dataset, then wrangles another member’s dataset to as close to the original as possible.
    • recreate-function: A more advanced, freeform project where you pick a function from an R package and try to recreate it using tidyverse tools. Complex, and good for showing R programming & tidyverse objectives at a high level at the same time.
  2. Check in on your accountability plans. Submit your mid-quarter reflection and updated plan to Canvas by next Monday, February 2nd.

Announcements:

  1. GitHub Submission: I promised a screen recording of the GitHub submission process, but have not had time to make it yet. I’ll make a Slack announcement when it’s ready. In the meantime, you can follow the extremely detailed written instructions.
    • Need help sooner? Ask the class on Slack! Some students have already figured it out and can give you guidance. Many students are still very confused, so if you’re the one to ask for help you’re doing lots of people a big favor.
  2. Submitted assignments: Qilong and I are double-grading early submissions to make sure we’re on the same page with grading standards. If you submitted early and haven’t received feedback yet, don’t worry – we’re working on it and will get to you as soon as we can!

Week 5 | Mid-quarter Review & R Workshops

Feb-3; Feb-5

Class plans:

  • Finish up tidyr slides
  • Independent and group work

Class Materials:

None

This week is for catching up on any missed work and getting ahead on the next few weeks.

Now is a good time to preview other packages from the tidyverse (ggplot2, forcats, stringr, etc.) and start working on projects that use these packages.

Reminders:

  • Did you turn in your accountability plan????

Unit 3: Data Wrangling & Reporting

Week 6 | Tidyverse Data Wrangling

Feb-10; Feb-12

Class plans:

Tuesday

  • Lecture: Tidyverse Wrangling with Strings
    • Basics of text data
    • String manipulation with base R
    • Comparable manipulation with stringr
    • Regular expressions (if time allows)
      • If we don’t get to it, work through the slides on your own.
  • Exercises/demos:
    • stringr demo

Thursday

  • Lecture: Tidyverse Wrangling with Factors
    • Basics of factors
    • Simple factor manipulation with base R
    • Advanced manipulation with forcats

Primary learning objectives:

    1. Data manipulation with dplyr and pipelines
    2. Tidy data structure
    3. String manipulation with stringr
    4. Factor manipulation with forcats
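As a hedged illustration of the string and factor topics above, the sketch below cleans a small made-up vector first with base R and then with stringr, and reorders and recodes a factor with forcats. It assumes the stringr and forcats packages are installed. The `responses` data are invented for this example.

```r
library(stringr)
library(forcats)

# Made-up survey responses with inconsistent case and whitespace.
responses <- c("  Strongly Agree", "agree ", "DISAGREE", "Agree")

# Base R string cleaning
base_clean <- tolower(trimws(responses))

# Comparable stringr version
tidy_clean <- responses |> str_trim() |> str_to_lower()

identical(base_clean, tidy_clean)  # TRUE

# Simple regular expression: "^agree" matches values that start with "agree".
str_detect(tidy_clean, "^agree")   # FALSE TRUE FALSE TRUE

# Base R factor with an explicit level order
resp_factor <- factor(
  tidy_clean,
  levels = c("disagree", "agree", "strongly agree")
)

# forcats: reorder levels by frequency, and rename a level (new = old)
resp_by_freq <- fct_infreq(resp_factor)
resp_recoded <- fct_recode(resp_factor, "strong agree" = "strongly agree")

levels(resp_by_freq)[1]  # "agree", the most frequent level
levels(resp_recoded)     # "disagree" "agree" "strong agree"
```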

Class Materials:

Additional Resources:

Strings

Factors

Next Steps:

  1. SAME AS WEEK 4 – Explore the tidyverse core projects. You can view them in the student hub or accept any as GH Classroom assignments via links on the menu. After this week, you’re ready to start any of them, but most have some tasks related to factor variables, which we’ll cover in week 7.
    • clean-level-1-mtcars: Structured exercise to introduce data manipulation in the tidyverse. Primarily uses dplyr and tidyr, but includes some manipulation of strings and factors as well
    • clean-level-2-midwest: Open-ended with a general goal but no step-by-step instructions. Similar content to level 1.
    • clean-group: Group project where each member makes a clean dataset messy, then cleans another member’s messy dataset following tidy data principles.
    • wrangle-level-1-starwars: Structured exercise to go further into data wrangling. Includes a mix of dplyr, tidyr, stringr, and forcats.
    • wrangle-level-2-gapminder: Open-ended with a general goal but no step-by-step instructions. Similar content to level 1.
    • wrangle-group: (Similar to the group cleaning) Group project where each member un-wrangles a dataset, then wrangles another member’s dataset to as close to the original as possible.
    • recreate-function: A more advanced, freeform project where you pick a function from an R package and try to recreate it using tidyverse tools. Complex, and good for showing R programming & tidyverse objectives at a high level at the same time.
  2. Where is your data project? We’re more than half-way through the quarter.
    • By this point you should minimally have a dedicated GitHub repository for your data project that syncs with an RStudio project. It should include the standard git repo files and structure (like your README and .gitignore), data in the form of a .csv or other tabular file, and a Quarto notebook with a basic outline of the project.
    • Ideally you also have one or more R scripts that do things like define functions and perform initial data read-in, wrangling, and write-out.
    • If you haven’t started yet, this needs to be a top priority.
    • If you have started, consider submitting whatever you have for feedback. Remember you absolutely do not have to have a complete project to submit (I strongly recommend you do not wait nearly that long!). Indicate what objectives you’re attempting and get clear feedback on whether you’re meeting expectations (and if not, how to do so). Even if you’re only attempting to demonstrate a few objectives, there’s no harm in getting guidance on what you have so far.
Announcements:

  1. GitHub Submission: There is now a screen recording of the GitHub submission process, which you can view here, in addition to the extremely detailed written instructions.
  2. Hunter’s Plain-English Guide to R: Hunter has turned his notes into a Quarto document and allowed me to share it on the website. It’s a great resource for beginners and a nice complement to the textbook and lecture materials. You can find it here.
  3. Example data project: As I mentioned last week, I’m setting up an example data project that shows how to meet the basic requirements for the integrated data project. I have initialized that repo in the d2m-r GitHub org, and you can view it here.
    • Note: As of Feb 8 2026, this repo is not complete. It does not currently meet the requirements for the integrative data project! I’ll be updating it as I have time, but I wanted to share it in its current state in case it’s helpful to anyone as a starting point. I’ll make an announcement when it’s complete.

Week 7 | Data Reporting, Part 1

Feb-17; Feb-19

Class plans: TBD!

Depending on class interest, we may cover some combination of:

  • Intro to Data Viz & ggplot2
  • Large- and/or small-group workshopping of integrative data projects
  • Crash course in reporting with a “results memo” example & template
  • More stringr and forcats practice
  • Intro to psychologists’ most used descriptive and inferential statistics in R

Primary learning objectives:

    1. Data manipulation with dplyr and pipelines
    2. Basic reporting: plots and descriptives in Quarto

Class Materials:

This is fully dependent on what we cover in class, but here are some tentative materials we may use:

Additional Resources:

  • I will add some links once I have a better sense of how these next 2 weeks will go.

Graded Submissions:

Recommended Exercises:

Announcements:

Reminders:

Week 8 | Data Reporting, Part 2

Feb-24; Feb-26

Class plans:

We’re playing it by ear for the last two weeks due to the cancellation of D2M-R II next quarter. In general, I’m opening things up to the class to decide what would be most useful in two respects:

  1. What do you need to get what you wanted out of this quarter’s material?
  2. What would be helpful to preview out of next quarter’s materials?

There is an ongoing poll on the Slack Announcements channel to vote on #2. Summarized:

  1. Data analysis: descriptive stats
  2. Data analysis: hypothesis testing
  3. Reporting: publication-quality tables
  4. Reporting: publication-quality ggplots
  5. Reporting: dynamic code and value reference
  6. Publication: more markdown & quarto features
  7. Publication: bibtex, inline citations, references pages
  8. Publication: apaquarto extension

During week 8, we will definitely:

  1. Review ggplot2 basics
  2. Introduce overviews of common descriptive statistics and hypothesis testing
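As a preview of those two topics, here is a minimal sketch using the built-in `mtcars` dataset. It assumes the dplyr and ggplot2 packages are installed. The variables and the choice of a t-test are for illustration only and are not drawn from the class slides.

```r
library(dplyr)
library(ggplot2)

# Descriptive statistics by transmission type (am: 0 = automatic, 1 = manual)
mpg_summary <- mtcars |>
  group_by(am) |>
  summarise(n = n(), mean_mpg = mean(mpg), sd_mpg = sd(mpg))

# A simple hypothesis test: Welch two-sample t-test from base R
mpg_test <- t.test(mpg ~ am, data = mtcars)

# ggplot2 basics: data, aesthetic mappings, geoms, and labels
mpg_plot <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(am))) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    color = "Transmission (0 = automatic, 1 = manual)"
  )

mpg_summary
mpg_test
mpg_plot
```

In a Quarto document, printing these objects inside a code chunk places the table, test output, and figure directly in the rendered report.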

Primary learning objectives: May vary, but most likely…

    1. RStudio + Quarto workflow
    2. Data manipulation with dplyr and pipelines
    3. Factors with forcats
    4. Basic reporting: plots and descriptives in Quarto

Class Materials:

Additional Resources:

  • readings
  • cheat-sheets
  • tutorials

Next steps:

  1. Complete core objectives projects, keeping in mind the goal you set for yourself in your accountability plan.
  2. Submit your final accountability check-in by next Monday, February 23.
  3. Submit a draft of your IDP soon to ensure timely feedback for another revision.

Announcements:

D2M-R II will no longer be offered in Spring 2026. This is due to an unfortunate, unavoidable, and unpredictable scheduling conflict that arose very recently. I’m making every effort to be available to any students who had intended to take D2M-R II next quarter, including (if there is interest) a working group for continued development of data reporting skills using R and Quarto.

Week 9 | Data Wrangling & Reporting Workshops

Mar-3; Mar-5

Class plans:

We’re playing it by ear for the last two weeks due to the cancellation of D2M-R II next quarter. In general, I’m opening things up to the class to decide what would be most useful in two respects:

  1. What do you need to get what you wanted out of this quarter’s material?
  2. What would be helpful to preview out of next quarter’s materials?

There is an ongoing poll on the Slack Announcements channel to vote on #2. Summarized:

  1. Data analysis: descriptive stats
  2. Data analysis: hypothesis testing
  3. Reporting: publication-quality tables
  4. Reporting: publication-quality ggplots
  5. Reporting: dynamic code and value reference
  6. Publication: more markdown & quarto features
  7. Publication: bibtex, inline citations, references pages
  8. Publication: apaquarto extension

During week 9, we will definitely:

  1. Have dedicated time to work on IDPs, including large-group workshops, small group collaboration, and/or individual work time
  2. Introduce basics of BibTeX and citations
  3. Review end-of-quarter logistics and IDP expectations

Primary learning objectives: May vary, but most likely…

    1. RStudio + Quarto workflow
    2. Data manipulation with dplyr and pipelines
    3. Basic reporting: plots and descriptives in Quarto

Class Materials:

Additional Resources:

Next steps:

It’s the end of the quarter!! Here are the deadlines you need to know (“me”/“I” = Dr. Dowling):

  1. Tuesday, March 3rd
    • Deadline for IDP drafts (for feedback)
      • This is a bit of a soft deadline, and I will continue to grade drafts as they come in if I am able to. There is simply a practical limitation here, where I have a queue of projects to grade and a limited amount of time to do so. Consider also that you’ll need to revise after my feedback.
    • Deadline to request pass/fail or incomplete
      • If you have not already arranged a plan, email me by Tuesday. We will then discuss a plan over email and finalize it by the end of the week. If we already made a plan earlier in the quarter, you’re good to go.
  2. Friday, March 6th
    • Deadline for core objectives projects
      • You may request extensions directly with Qilong. His ability to grant extensions depends on the number of projects that come in before the deadline, and whether to grant an extension or not is entirely at Qilong’s discretion. Do not contact Dr. Dowling to request a core project extension.
    • Deadline for enrichment projects
      • You may request extensions by emailing Dr. Dowling. I will be as flexible with extensions as I am able. Basically this comes down to how many IDP drafts and final submissions are in my queue, which will take priority.
      • Submissions granted exceptions may not receive written comments.
    • Deadline to request extensions for IDPs
      • With the flexibility I offer for submissions across the board, I have to draw a line somewhere in order to ensure I can submit grades to the registrar on time. As always, I will grant as many requests as I can, but there will come a point where I simply cannot accommodate any more.
    • Deadline to finalize plans for pass/fail or incomplete
      • We will discuss what this means after you contact me earlier in the week.
  3. Tuesday, March 10th
    • FINAL DEADLINE FOR INTEGRATIVE DATA PROJECTS
      • Whatever version you have submitted at the end of the day will be the version that gets the final grade. If you submitted a late draft that I didn’t get to and then submitted an update, I will grade the most recent version only.
    • Community engagement self-evaluation due
      • Canvas submission of a brief reflection and suggested grade out of 10 points. More detailed instructions on Canvas.
  4. Tuesday, March 17th
    • All grades due to the registrar. There’s nothing you need to do as a student here, but I just want to be transparent about grading and extension timelines. Remember that instructional work for any given class is just one part of the responsibilities of your professors and TAs. We have other classes, research, advising, and admin work (on top of general being-a-human stuff), and even the most generous late policies have to have limits at the end of the quarter. Also, when you don’t see grades posted until the last minute, it’s never (well, rarely) because your professor wants to torture you. We are literally grading through the last minute.

Announcements:

D2M-R II will no longer be offered in Spring 2026. This is due to an unfortunate, unavoidable, and unpredictable scheduling conflict that arose very recently. I’m making every effort to be available to any students who had intended to take D2M-R II next quarter, including (if there is interest) a working group for continued development of data reporting skills using R and Quarto.

If you are interested in joining a working group/workshop in the Spring to continue getting support on thesis projects or other ongoing work with R, RStudio, or other topics related to the class, please reach out to let me know what kind of group you’d be interested in.