Week 4: Average, SD, and Normal Approximation

Conceptual Homework

Read FPP, chapter 4, and do the assigned exercises.

Read my notes on measures of location and scale. They review some of the material from FPP, but also cover other single-number summaries and computation. Update: Be sure to notice the group_by() and summarize() functions in the notes, and please reproduce the examples in the notes. These are critical for the computational homework below. You might find these two lecture videos helpful: (1) a video describing how group_by() and summarize() work and (2) a video with a live coding example of group_by() and summarize().

Read FPP, chapter 5, and do the assigned exercises.

Read and review my notes on the normal curve, which has a few more mathematical and computational details. Do the review exercise at the end that asks about the extremity of Congressional leadership. Here’s a PDF version that might be more readable.

Make sure to turn in three things:

The assigned review exercises from ch. 4 of FPP.
The assigned review exercises from ch. 5 of FPP.
The leadership extremity exercise.

Computational Homework

Starting this week, I’m giving you a lot of flexibility on the computational assignments. Your task this week: learn something interesting from your data using the conceptual and computational tools we learned in class (i.e., measures of location and scale).

conceptual tools: average, SD, median, IQR, MAD, and related quantities capturing location and scale.
computational tools: group_by() and summarize() and misc. geoms such as geom_line().
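
As a minimal sketch of how these tools fit together, the example below computes location and scale summaries by group and then draws the averages with geom_line(). The data frame `toy` and its variables `party`, `year`, and `ft` are made up for illustration; substitute your own tidy data. Two details are worth checking against the notes: R's sd() divides by n - 1 (FPP's SD divides by n), and R's mad() multiplies the median absolute deviation by 1.4826 unless you set constant = 1.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data: a 0-100 score (ft) for three respondents in each
# party-year combination. Replace with your own tidy data frame.
toy <- tibble(
  party = rep(c("Democrat", "Republican"), each = 6),
  year  = rep(rep(c(2016, 2020), each = 3), times = 2),
  ft    = c(60, 70, 80, 60, 80, 100, 30, 40, 50, 10, 30, 50)
)

# group_by() defines the subgroups; summarize() collapses each subgroup
# to a single row of summaries.
smry <- toy %>%
  group_by(party, year) %>%
  summarize(
    avg     = mean(ft),
    sd      = sd(ft),                 # divides by n - 1
    median  = median(ft),
    iqr     = IQR(ft),
    mad     = mad(ft, constant = 1),  # unscaled median absolute deviation
    n       = n(),
    .groups = "drop"
  )

print(smry)

# One line per party, showing how the group average changes across years.
gg <- ggplot(smry, aes(x = year, y = avg, color = party)) +
  geom_line() +
  geom_point()
```

Calling print(gg) displays the plot, and ggsave() writes it to a file. The same pattern works with any grouping variable and any summary function that returns a single number.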

You will do a simple data analysis and write a short paper summarizing your results. Feel free to use additional tools in your toolbox, such as histograms and data wrangling, or learn new tools as needed, but you must feature some of the tools we learned this week.

IMPORTANT: As you work through the steps below, feel free to copy files over from previous weeks’ assignments.

Prepare.
1. Open RStudio and start a new RStudio Project called hw04 and initialize git.
2. Open GitHub Desktop, add the local repo hw04-first-last, and publish the initial files up to GitHub as a part of the pos5737 organization.
3. Create data/, doc/, and R/ subdirectories of hw04.
4. Save the raw version of your data set to data/.
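
If you prefer to create the subdirectories from the R console rather than by hand, a short sketch follows. It assumes your working directory is the project root, which is the default when you open the RStudio Project.

```r
# Create the three project subdirectories in the current working directory.
# showWarnings = FALSE keeps R quiet if a folder already exists.
for (d in c("data", "doc", "R")) {
  dir.create(d, showWarnings = FALSE)
}

# Confirm that all three folders are present.
print(dir.exists(c("data", "doc", "R")))
```

Git does not track empty directories, so doc/ and R/ will not appear on GitHub until they contain at least one file.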
Wrangle, summarize, and plot. In thoroughly commented R scripts, do the following:
1. Wrangle the data into a clean, tidy data frame. You should have already done most or all of this work, but continue to make improvements as you see the opportunity or need.
2. Compute conceptually/theoretically/normatively meaningful summaries of the location (average, median) and/or dispersion (SD, IQR, MAD) for conceptually/theoretically/normatively meaningful subgroups (men/women, 7-point party identification, democracy/autocracy, etc.).
3. Plot the summaries. Make it easy for the reader to see the relevant comparisons. Experiment to find the best way to compare the groups of interest. You might use lines, colors, sizes, location, facets, and so forth. Focus first on finding an approach that makes the comparisons easy. Then worry about the cosmetics.
4. Create a nicely formatted table of the averages (and/or SDs, medians, IQRs, MADs, etc.).
   a. You might find the kableExtra package helpful. Note that you might need to wrangle the summaries into a more appropriate structure for tables, especially using tidyr::pivot_wider().
   b. I like to save tables as a separate .tex file and \input{} it into the LaTeX manuscript (you probably need \usepackage{float} and \usepackage{booktabs} at the top of your .tex manuscript). The tables repo contains an example R script that makes a table and a LaTeX document that \input{}s the table. (There are tutorials for Word, if you prefer that.)
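
The table step can be sketched as follows. This is one possible approach, not the script from the tables repo: it uses knitr::kable(), which kableExtra builds on, and the data frame `smry`, its values, and the file name `avg-table.tex` are all hypothetical.

```r
library(dplyr)
library(tidyr)
library(knitr)

# Hypothetical summaries, one row per party-year, shaped like the output
# of group_by() followed by summarize().
smry <- tibble(
  party = rep(c("Democrat", "Republican"), each = 2),
  year  = rep(c(2016, 2020), times = 2),
  avg   = c(70, 80, 40, 30)
)

# Reshape so that each year becomes its own column: one row per party.
wide <- smry %>%
  pivot_wider(names_from = year, values_from = avg)

# Build a LaTeX table. booktabs = TRUE produces \toprule, \midrule, and
# \bottomrule, which require \usepackage{booktabs} in the manuscript.
tab <- kable(wide, format = "latex", booktabs = TRUE, digits = 1,
             col.names = c("Party", "2016", "2020"))

# Write the table to its own .tex file (the file name is an assumption)
# so the manuscript can pull it in with \input{}.
out_file <- "avg-table.tex"
writeLines(as.character(tab), out_file)
```

In the manuscript, the file would then be included with \input{} using the path to wherever you saved it. Keeping the table in its own file means rerunning the R script updates the manuscript without any copying and pasting.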
Write a short paper explaining your results. You may use LaTeX (recommended), Microsoft Word, or whatever you like to compose the document, but please include the PDF and keep it synced with the .tex or .docx source file. You should organize your work following the standard political science format, but please cover the following at some point.
1. Write enough so that your reader understands how to interpret the key figure(s) and table(s).
2. Spend some time explaining why someone should care about your point. (In my experience, this is both the hardest and most valuable part of a paper.)
3. If you are able, explain how your paper fits with existing literature. There’s no need to read extra, but if you are aware of work on the topic, work to incorporate it.
4. Helpful tip: You should never write a LaTeX document from scratch. Instead, you should start with a template. I always just copy over my .tex manuscript from a previous project and start replacing the text. You should do something similar. I made two templates for beginners (simple .tex and .Rmd examples and an example with a Makefile). You should make sure you can get these example .tex files to compile and then slowly make changes. If you run into trouble, please ask for help.
Develop the README. Organize your README in a helpful way. You should probably include a summary of the basic argument, a link to the PDF of the manuscript, and perhaps a figure or table or two. When designing or updating your README, keep the goal in mind: to quickly bring the user up to speed with the project and help them understand, evaluate, and extend your work.

When you are done, please make sure that you have completed all the steps above. Then submit your link to Canvas.