Introduction
PDI’s internal models are available in each account and included on all polling samples.
These models are what we consider “functional models,” intended to help any campaign target voters more precisely. PDI models (like most models) work best when used to supplement or refine other available data, or when no other data is available. For example, if you are trying to identify degrees of partisanship or ideology, you should start with the actual party of the voters you want to further distinguish, such as including only INDEPENDENT voters or excluding all DemPlus and RepPlus voters. Then use the partisanship or ideology model score to segment or separate the remaining voters whose partisanship is unknown.
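The two-step approach described above (filter on known data first, then apply the model score) can be sketched in Python. This is only an illustration: the record layout, the field names (`party`, `dem_plus`, `rep_plus`, `model_score`), and the sample voters are hypothetical and do not reflect PDI's actual export format.

```python
# Hypothetical voter records; field names and values are invented for illustration.
voters = [
    {"id": 1, "party": "DEM", "dem_plus": True, "rep_plus": False, "model_score": 12},
    {"id": 2, "party": "INDEPENDENT", "dem_plus": False, "rep_plus": False, "model_score": 81},
    {"id": 3, "party": "INDEPENDENT", "dem_plus": True, "rep_plus": False, "model_score": 20},
    {"id": 4, "party": "INDEPENDENT", "dem_plus": False, "rep_plus": False, "model_score": 35},
    {"id": 5, "party": "REP", "dem_plus": False, "rep_plus": True, "model_score": 90},
]


def unknown_partisanship(records):
    """Step 1: rely on known data by dropping every DemPlus and RepPlus voter."""
    return [r for r in records if not (r["dem_plus"] or r["rep_plus"])]


def rank_by_score(records):
    """Step 2: use the model score only to order the voters that remain."""
    return sorted(records, key=lambda r: r["model_score"])


remaining = rank_by_score(unknown_partisanship(voters))
print([r["id"] for r in remaining])
```

Voter 3 is dropped even though they are registered as independent, because their DemPlus flag already tells us more than a modeled score would.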
PDI Ideology Liberal / Conservative
This model identifies how conservative or liberal a voter is based on a number of factors. The analysis begins with the survey responses, but then also includes elements such as the actual party registration, household partisan composition using PDI's 27 different household party type codes, and precinct level election outcomes in key past elections.
Category | Count | Percentage (%) |
Very Liberal | 962,040 | 4.47% |
Liberal | 7,522,419 | 34.99% |
Moderate | 7,856,581 | 36.54% |
Conservative | 4,821,438 | 22.43% |
Very Conservative | 337,617 | 1.57% |
To use this model in the PDI, select a score from 1-100, where 1 is the most liberal and 100 is the most conservative. As can be seen in the following chart, the median result is a more liberal voter; however, there is a second hump in the data showing a strong population of conservative, but not very conservative, voters. A score below 42 represents someone classified as liberal, and a score over 72 represents someone classified as conservative. Within the range of 42-72, where voters are categorized as moderate, scores from 42 to 57 lean liberal and scores from 57 to 72 lean conservative.
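As a rough sketch, the score ranges above can be written as a small Python function. The function name and segment labels are hypothetical. Two assumptions are made here: a score of exactly 57 is treated as leaning conservative (the ranges as stated share that boundary), and the Very Liberal and Very Conservative categories are folded into Liberal and Conservative because their score thresholds are not given.

```python
def ideology_segment(score):
    """Map a 1-100 ideology score to the broad segments described in the text."""
    if not 1 <= score <= 100:
        raise ValueError("score must be between 1 and 100")
    if score < 42:
        return "Liberal"
    if score > 72:
        return "Conservative"
    # 42-72 is the moderate band. Assigning 57 to the conservative-leaning
    # half is an assumption; the stated ranges overlap at that value.
    if score < 57:
        return "Moderate, leans liberal"
    return "Moderate, leans conservative"


for s in (1, 41, 42, 56, 57, 72, 73, 100):
    print(s, ideology_segment(s))
```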
PDI Partisanship of Independent Voters
The current PDI system includes party registration and a set of party descriptions called DemPlus and RepPlus, which can be used to capture both those party registrants and people who have donated to, pulled ballots for, or previously been registered Democratic (DemPlus) or Republican (RepPlus).
This model builds on DemPlus and RepPlus by allowing campaigns to target independents who are modeled to vote primarily with Democrats or Republicans.
This model uses some of the same building blocks, but then adds household partisan makeup, ethnicity, age, registration date and voter surveys in which independent voters were asked if they primarily sided with Democrats or Republicans.
Category | Count | Percentage (%) |
Mostly or Always Democrats | 9,672,898 | 42.67% |
Usually through Mostly Democrats | 1,887,644 | 8.33% |
True Swing Voter | 3,046,306 | 13.44% |
Usually through Mostly Republicans | 2,594,097 | 11.44% |
Mostly or Always Republicans | 5,467,880 | 24.12% |
PDI Children in Household/Likely Parent
To use this model in the PDI, select all scores greater than 50 for households with school-aged children under 18, and all scores less than 50 for households without children under 18. The closer a score is to either extreme, the more confidence there should be that children are (or are not) in the household. Because this is a binary model, we recommend using 50 as a cut point. You can also just use the “Likely Parents” option found in the Demographics tab of the PDI to select the same voters.
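A minimal Python sketch of the cut point logic follows. The function names are hypothetical, and because the text does not say how a score of exactly 50 should be treated, this sketch leaves that case unclassified.

```python
def likely_parent(score, cut_point=50):
    """True above the cut point, False below it, None exactly at it."""
    if score > cut_point:
        return True
    if score < cut_point:
        return False
    return None  # Assumption: a score equal to the cut point is left unclassified.


def distance_from_cut(score, cut_point=50):
    """Larger distances indicate more confidence in the classification."""
    return abs(score - cut_point)


for s in (5, 45, 50, 60, 95):
    print(s, likely_parent(s), distance_from_cut(s))
```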
PDI Support for Abortion/Choice (2023-2024)
This probabilistic model measures how likely someone is to respond that they support Pro-Choice policies on a survey. Importantly, scores on either extreme do not necessarily reflect an intensity of support, but rather our confidence in what we know about the voter’s propensity to support Pro-Choice policies. For example, scores of 75 are not necessarily more pro-choice than scores of 60; we simply have more confidence that they are pro-choice. As a cutoff for messaging, we recommend starting with a cut point at 50 to exclude anti-abortion voters from paid media outreach.
Deep dive into model scores
On a technical level, all of our models are built in a similar way. Since 2016, PDI has continuously run large email surveys via the PDI Emailing module and the SurveyMonkey platform, giving us access to over 200,000 voter-file matched responses. We then take the relevant questions from the surveys, along with the respondents’ corresponding voter file data, and fit machine learning models to figure out which variables are the most predictive of their responses. After validating these models on a holdout sample of survey responses (i.e., some survey responses that are intentionally left out of the modeling process in order to figure out how predictive these models are), we then use these models to score the entire voter file.
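The fit, validate, and score sequence described above can be sketched in Python. This is not PDI's pipeline: the "model" is a deliberately simple stand-in (the share of "yes" answers per party registration), the survey responses are synthetic, and the field names, the 20% holdout fraction, and the 0-100 scaling are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_holdout(responses, holdout_fraction=0.2, seed=0):
    """Shuffle survey responses and set aside a holdout sample."""
    shuffled = responses[:]
    random.Random(seed).shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def fit_rate_model(train):
    """Stand-in model: share of 'yes' answers per party registration."""
    counts = defaultdict(lambda: [0, 0])  # party -> [yes answers, total answers]
    for r in train:
        counts[r["party"]][0] += r["answer"]
        counts[r["party"]][1] += 1
    overall = sum(r["answer"] for r in train) / len(train)
    rates = {party: yes / total for party, (yes, total) in counts.items()}
    # Parties never seen in training fall back to the overall rate.
    return lambda voter: rates.get(voter["party"], overall)


def holdout_accuracy(model, holdout):
    """Fraction of held-out responses predicted correctly at a 0.5 threshold."""
    hits = sum((model(r) >= 0.5) == bool(r["answer"]) for r in holdout)
    return hits / len(holdout)


# Synthetic survey responses, invented purely for this sketch.
rng = random.Random(1)
p_yes = {"DEM": 0.8, "REP": 0.2, "INDEPENDENT": 0.5}
responses = []
for _ in range(1000):
    party = rng.choice(list(p_yes))
    responses.append({"party": party, "answer": int(rng.random() < p_yes[party])})

train, holdout = split_holdout(responses)
model = fit_rate_model(train)
accuracy = holdout_accuracy(model, holdout)

# Final step: score every voter on a (tiny, invented) voter file.
voter_file = [{"party": "DEM"}, {"party": "REP"}, {"party": "GREEN"}]
scores = [round(100 * model(v)) for v in voter_file]
print(f"holdout accuracy: {accuracy:.2f}", scores)
```

The holdout responses never touch `fit_rate_model`, so the accuracy figure reflects how the model performs on people it has not seen, which is the situation it faces when scoring the full voter file.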
In order to understand how these models work, it’s useful to crack them open and look at individual examples. Using a methodology developed by researchers at the University of Washington known as SHAP, we can examine the individual “contribution” of each variable to each individual’s score. For an illustrated example, here’s a plot explaining the author’s partisanship score.
This seems fairly obvious -- the author is a registered Democrat, who lives with another Democrat, in a dense urban city. We have pretty high confidence this is a Liberal!
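SHAP is built on Shapley values, which split the gap between a voter's score and a baseline score into one contribution per variable. As a hedged illustration of that idea, the Python sketch below computes exact Shapley values for a toy scoring function by averaging over every ordering of the variables. The scoring function, feature names, and point values are invented and are not PDI's model. Comparing against a single all-False baseline voter is also a simplification: SHAP itself uses an average over background data as its baseline and relies on faster algorithms.

```python
from itertools import permutations


def toy_score(features):
    """Hypothetical scoring function (lower = more Democratic); not PDI's model."""
    score = 50.0
    if features["registered_dem"]:
        score -= 20
    if features["lives_with_dem"]:
        score -= 10
    if features["dense_urban"]:
        score -= 5
    # Interaction: an all-Democratic household shifts the score a bit further.
    if features["registered_dem"] and features["lives_with_dem"]:
        score -= 5
    return score


def shapley_values(model, voter, baseline):
    """Average each feature's marginal effect over all feature orderings."""
    names = list(voter)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = voter[name]  # switch this feature to the voter's value
            new = model(current)
            totals[name] += new - previous
            previous = new
    return {name: total / len(orderings) for name, total in totals.items()}


voter = {"registered_dem": True, "lives_with_dem": True, "dense_urban": True}
baseline = {"registered_dem": False, "lives_with_dem": False, "dense_urban": False}
contributions = shapley_values(toy_score, voter, baseline)

for name, value in contributions.items():
    print(f"{name}: {value:+.1f}")
print("baseline:", toy_score(baseline), "voter:", toy_score(voter))
```

The contributions always add up to the difference between the voter's score and the baseline score, which is what makes a plot of them readable as an explanation of a single score.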