Clean, Tidy Data Sets
The data sets below are tidy and ready to visualize and model. I filtered
each one to a useful subset, selected and descriptively renamed the most
important variables, and meaningfully reordered the factors.
gamson | [.csv] [.dta] [.rds] [.xlsx] | [html] | From Warwick and Druckman (2006).
health | [.csv] [.dta] [.rds] [.xlsx] | [html] | From Barrilleaux and Rainey (2014).
nominate | [.csv] [.dta] [.rds] [.xlsx] | [html] | Ideology scores for members of the U.S. House.
parties | [.csv] [.dta] [.rds] [.xlsx] | [html] | From Clark and Golder (2006).
Original Data Sources
- For replication data sets, I recommend starting
with the Dataverse archives for AJPS [web] and
PSRM.
- For really raw data for wrangling
practice, I recommend (ordered from least to most difficult) Donald
Trump’s tweets [GitHub],
the World Bank’s World Development Indicators [GitHub], Google political ads
data [web,
Dropbox],
or 10 million dyadic events [Dataverse].
- For data on international politics, I recommend COW
[web], DESTA [web], and
Matt Fuhrmann’s data on nuclear weapons [web].
- For data on political institutions, I recommend
Polity IV [web], DD [web],
Freedom House [web], and
DES [web].
- For US state politics, I recommend the Correlates
of State Policy data set [web],
which combines variables from many projects in a single, enormous
collection.
- For data on legislator ideology, I recommend
NOMINATE [web] and the American
Legislatures Project [web].
- For data on human rights, I recommend Human Rights
Scores [web], PTS [web], CIRI [web],
and ITT [web].
- For survey data, I recommend the ANES [web], CCES [web], CSES [web], and the World Values
Survey [web].
- For data from randomized experiments, I recommend
TESS [web].
- Google now has a search for data
sets.
- Let’s go meta: PolData is a data set
of data sets.
Carlisle Rainey