How To Become Nate Silver In 9 Simple Steps

Nate Silver has built one of the best models out there. It's accurate, consistent, and totally scientific. 

One advantage of being totally scientific is that his model can be replicated.

Silver, over the past five years, has explained in depth how his model actually works.

Looking through that, we now have a step-by-step process on how to be Nate Silver.

Here's what you'll need:

  • A copy of Microsoft Excel
  • A definitive source of campaign finance data and election data
  • A vast polling database with 15 years of impeccably accurate information.

The first of these supplies can be picked up at your local computer store or from several online retailers.

The last one may be difficult to find (as early as 2010 Silver was working from a database of "4,670 distinct polls from 264 distinct pollsters covering 869 distinct electoral contests"), but once you have it ready to go you're all set to start your FiveThirtyEight model.

Step One: Collect all the recent polls that you can find from within a state. We'll label each of these polls with a variable like P1, P2, P3, or more generally Pi.

Step Two: Figure out the poll weights. Now, we need to find the weighting for each of the polls — w1, w2, w3, or more generally wi — and that's not exactly an easy task. Each wi is made up of three different values. The "recency" of the poll (we'll call this Ri), the sample size of the poll (Ni for each Pi) and the Pollster Rating for the company that did the poll (Qi for each Pi).

  • Ri is expressed as an exponential decay function. The older the poll gets, the lower this number gets. If the same polling company releases a newer poll, Ri also goes down. We know that in 2008, Ri was expressed in terms of the poll's "half-life."
  • Ni has to do with the sample size. The more people that the poll Pi sampled, the higher Ni gets. There are diminishing returns, though, so the rate that Ni grows slows as the sample size gets larger and larger. 
  • The Pollster rating requires the use of that massive database of polling we mentioned earlier. This analyzes the accuracy of the poll in retrospect. Better pollsters have bigger values of Qi. 
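
The two poll-level factors above, Ri and Ni, can be sketched in a few lines of Python. This is only an illustration: the 30-day half-life, the 600-respondent baseline, the square-root curve, and the function names are all assumptions made here, not values or formulas Silver has published in this form.

```python
import math

# Assumed values for illustration only; the article does not give them.
HALF_LIFE_DAYS = 30.0
BASELINE_SAMPLE = 600.0

def recency_weight(age_days, half_life=HALF_LIFE_DAYS):
    """Exponential decay: the weight halves every `half_life` days."""
    return 0.5 ** (age_days / half_life)

def sample_size_weight(n_respondents, baseline=BASELINE_SAMPLE):
    """Square-root growth, so each extra respondent adds less than the last."""
    return math.sqrt(n_respondents / baseline)

r_fresh = recency_weight(0)          # a poll released today
r_month_old = recency_weight(30)     # one half-life old
n_small = sample_size_weight(600)    # baseline sample
n_large = sample_size_weight(2400)   # four times the sample, only twice the weight
```

Quadrupling the sample only doubles the factor, which is the diminishing-returns behavior described for Ni.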

Step Three: Get Qi.  The Pollster rating, Qi, has to be determined for each pollster before we can move forward. The ways Silver does this is rather complex, but the pollster ratings are the signature element of his model so this is important. 

The way Silver goes about developing these ratings is outlined in this blog post from 2010, but here's the gist:

The point is to see how a pollster does compared to the mean. It's much easier to call a presidential contest (avg. error 2.8 points) than a gubernatorial primary (avg. error 7.8 points). 

First, run a regression for races where each pollster has a dummy variable. Other variables should include recency for each poll, sample size, and dummy variables about the race. The coefficient found for each pollster's dummy variable is the measured skill and is called the "raw score."

Next, we need to figure out a value called the Reversion parameter. If "n" is the number of previous polls from the pollster in the sample:

reversionparameter = 1 - (0.06 * sqrt(n))

Next, we regress these raw scores toward the mean to account for luck, variance and noise. Pollsters in the National Council on Public Polls or the AAPOR Transparency Initiative are considered better than non-members.

Here are the values of a variable called "groupmean" that we'll need for the final calculation:

  • NCPP or the AAPOR Transparency Initiative members: -0.50.
  • Polls by telephone: +0.26
  • Polling by means of the Internet: +1.04.

Here's how we get the adjusted score:

adjscore = (rawscore * (1 - reversionparameter)) + (groupmean * reversionparameter)

Negative scores mean good pollsters, positive ones mean bad pollsters. Finally, Silver calculates Pollster-Induced Error: PIE = 2 + adjscore. The minimum PIE is 0, and for our purposes Qi is PIE.
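
Here is a minimal Python sketch of the three formulas above, chained together. The dictionary keys, the function names, and the example pollster are made up for illustration, and treating the three groupmean values as a simple lookup is an assumption.

```python
import math

# groupmean values quoted above; the key names are hypothetical.
GROUP_MEAN = {"ncpp_aapor": -0.50, "telephone": 0.26, "internet": 1.04}

def reversion_parameter(n_polls):
    """1 - 0.06 * sqrt(n): more past polls means less reversion to the group mean."""
    return 1 - 0.06 * math.sqrt(n_polls)

def adjusted_score(raw_score, n_polls, group):
    """Blend the raw score with the group mean using the reversion parameter."""
    rev = reversion_parameter(n_polls)
    return raw_score * (1 - rev) + GROUP_MEAN[group] * rev

def pollster_induced_error(raw_score, n_polls, group):
    """PIE = 2 + adjscore, floored at 0."""
    return max(0.0, 2 + adjusted_score(raw_score, n_polls, group))

# Invented pollster: raw score of -1.0 from 100 past polls, NCPP/AAPOR member.
pie_example = pollster_induced_error(-1.0, 100, "ncpp_aapor")
```

With 100 past polls the reversion parameter is 0.4, so this pollster keeps 60 percent of its raw score and ends up with a PIE of about 1.2. Since a lower PIE means a better pollster, any weight built from it would have to shrink as PIE grows; how that conversion is done is not stated here.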

Step Four: Get the weighted polling average.  Silver doesn't say how he combines these values Ri, Ni, and Qi to form Wi, and by now that's most of what remains of his "secret sauce." We'll just multiply them. 

Now, we have a weight Wi for each poll Pi. Take the average for each state of all the weighted polls. Each pollster has a "House Effect," or a demonstrable partisan skew as compared to the running poll average. For many, this isn't a major factor, but the House effect for each poll (Hi) has to be factored in.

When n is the number of polls in the average, the Weighted Polling Average is equal to ∑[(Wi • Pi) + Hi] / n.
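
As a rough sketch, here is one way to turn that average into code. Note the swap: instead of dividing by n as written above, this version applies each house-effect correction to its poll and divides by the sum of the weights, which is the conventional form of a weighted mean. That choice, the function name, and the sample polls are assumptions made for illustration.

```python
def weighted_polling_average(polls):
    """polls: list of (margin, weight, house_effect) tuples.

    Each margin is corrected by its house effect, then the corrected
    margins are averaged with weights normalized to sum to 1.
    """
    total_weight = sum(w for _, w, _ in polls)
    if total_weight == 0:
        raise ValueError("at least one poll needs a nonzero weight")
    return sum(w * (p + h) for p, w, h in polls) / total_weight

# Made-up polls: (Democratic margin in points, weight Wi, house-effect correction Hi)
example_polls = [(4.0, 1.0, 0.0), (2.0, 0.5, 1.0), (6.0, 0.5, -1.0)]
average = weighted_polling_average(example_polls)
```

In this toy example the corrected margins are 4, 3 and 5 points, and the weighted average comes out to 4 points.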

Next, do a national trend line adjustment to the weighted polling average. Nationally, if the generic Democrat has lost two points, factor that into the Weighted Polling Average. 

Finally, do a likely voter adjustment to the trend-adjusted weighted polling average. Silver's version is simple: just add the expected percentage to the party that is usually expected to get more votes among likely voters.

This, incidentally, is the FiveThirtyEight Weighted Polling Average you'll see in each Senate race and state presidential race.

Step Five: The FiveThirtyEight Regression. Silver creates one additional "poll" that gets averaged into the mix. 

This "poll" is actually the result of a regression that brings in the major ground effects of the race — candidate experience, partisan tilt of the state, incumbency stats, et cetera.

Silver has already figured out the coefficients based on long time study of prior races, like the marginal quantitative advantage that a Governor has over a House Representative in a Senate election.

You'll have to figure that out on your own by running regressions of historical data, just like Silver presumably did. Once you do figure out those coefficients (A, B, C...), push them into a regression that looks something like this:

LR = Ax1 + Bx2 + Cx3 + Dx4 + Ex5 + Fx6

LR is the result of this "poll," and the xi values correspond to these variables:

  • x1, the partisan voting index of the state. This is probably taken from Cook Political Report, and is a number describing the average margin of victory for a generic Republican candidate over a generic Democrat. In Virginia, for example, a generic Republican would beat a generic Democrat by 1 point, so the PVI would be 1. 
  • x2, the Party Identification in the state. This value compares the number of registered Dems to the number of registered GOP members.
  • x3, a number (or numbers) describing fundraising data for each candidate
  • x4, a binary variable for incumbency. 
  • x5, a number describing the approval rating of an incumbent, if there is one.
  • x6, a series of dummy variables describing "stature." For senators and governors this is 3, for Representatives, Attorneys General and big-city mayors this is 2, for state-level offices this is 1, for no prior experience this is 0.

LR then goes into the average as a poll. It's weighted as a well-regarded poll, with a weight of 0.6. Once factored into the average, this becomes the FiveThirtyEight State Fundamentals snapshot. This is known hereinafter as the "projected vote split."
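
To make the regression "poll" concrete, here is a small Python sketch. Every number in it is invented: the coefficients stand in for the A through F you would have to estimate yourself, the feature values and sign conventions are placeholders, and blending LR in by weight against the summed poll weights is an assumption about what "goes into the average as a poll" means.

```python
# Hypothetical coefficients A..F; real ones would come from regressions on historical races.
COEFFICIENTS = {"pvi": 0.5, "party_id": 0.3, "fundraising": 0.2,
                "incumbent": 2.0, "approval": 0.1, "stature": 1.0}

def regression_poll(features, coefficients=COEFFICIENTS):
    """LR = A*x1 + B*x2 + C*x3 + D*x4 + E*x5 + F*x6."""
    return sum(coefficients[name] * value for name, value in features.items())

def blend_with_polls(poll_average, poll_weight_total, lr, lr_weight=0.6):
    """Average LR into the polls as one extra poll carrying a weight of 0.6."""
    return (poll_average * poll_weight_total + lr * lr_weight) / (poll_weight_total + lr_weight)

# Invented inputs for a single race.
features = {"pvi": 1.0, "party_id": -2.0, "fundraising": 3.0,
            "incumbent": 1.0, "approval": 5.0, "stature": 3.0}
lr = regression_poll(features)
snapshot = blend_with_polls(poll_average=4.0, poll_weight_total=2.4, lr=lr)
```

The more real polls a race has, the larger their combined weight and the less the regression "poll" moves the snapshot.
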

Step Six: Error analysis. The error for the snapshot projection is determined from variables based on prior FiveThirtyEight projections. Here are the ones that Silver specifically refers to as important:

  • The error is higher in races with fewer polls.
  • The error is higher in races where the polls disagree with one another.
  • The error is higher when there are a larger number of undecided voters.
  • The error is higher when the margin between the two candidates is lopsided.
  • The error is higher the further one is from Election Day.

After finishing this, we have two statistics: The projected vote split and the standard error. 

Step Seven: Prepare for simulation. 

So now, we want to run around 100,000 simulations to figure out what happens next. 

We have projected vote split and standard error for each of the 50 states for our model. 

We need to split up the error into two types: National Error and Local Error. We know the total error from Step Six, when we calculated the standard error.

National error is calculated from a historical analysis of poll changes between the date of the analysis and Election Day, as well as general changes. We'll call this NE.

Here's the relationship between Local, National, and Total error:

Local Error = √[(Total Error)² - (National Error)²]
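
This is the usual rule for independent errors, which add in quadrature: (Total Error)² = (National Error)² + (Local Error)². A minimal Python version, with a hypothetical function name and made-up example values, looks like this:

```python
import math

def local_error(total_error, national_error):
    """Local = sqrt(Total^2 - National^2), assuming the two parts are independent."""
    if national_error > total_error:
        raise ValueError("national error cannot exceed total error")
    return math.sqrt(total_error ** 2 - national_error ** 2)

example_local = local_error(5.0, 3.0)  # 4.0
```

A total error of 5 points with a national error of 3 points leaves a local error of 4 points, not 2.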

Step Eight: Simulate Once. Do you have Excel ready? Great. We now calculate values for Local and National error based on a normal distribution, using Microsoft Excel's NORMINV(RAND(), mu, sigma) function.

For national error, the value of mu is equal to the observed NE and sigma is equal to the standard deviation observed when calculating NE. 

In each simulation, NE is the same for each state. But in each state, we calculate a local error.

  • To calculate Total Error, run NORMINV with mu equal to the standard error developed in Step Six for each state and the related sigma.
  • National Error comes from NE.  

Then, plug these values into the Local Error formula to get the local error.

So now, you have a row in Excel with (a) the national error and (b) 50 local errors, one for each state. Take the Local Error for each state and combine it with the related projected vote split that you got in Step Six. This is this simulation's result, state by state.
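
If you'd rather not do this in a spreadsheet, one simulated row can be sketched in Python. This is a simplified variant, not the exact recipe above: it assumes both errors are zero-mean normal draws, uses one national draw shared by every state plus an independent local draw per state, and adds both to the projected margin. The state numbers are made up.

```python
import math
import random

def simulate_once(states, national_sd, rng):
    """states: {name: (projected_margin, total_sd)}; returns {name: simulated_margin}.

    rng.gauss(mu, sigma) plays the role of Excel's NORMINV(RAND(), mu, sigma).
    """
    national_shift = rng.gauss(0.0, national_sd)  # one draw shared by every state
    result = {}
    for name, (margin, total_sd) in states.items():
        # Local spread from the quadrature rule; clamped at 0 if national_sd is larger.
        local_sd = math.sqrt(max(total_sd ** 2 - national_sd ** 2, 0.0))
        result[name] = margin + national_shift + rng.gauss(0.0, local_sd)
    return result

# Made-up inputs: positive margins favor the Democrat.
example_states = {"Virginia": (1.5, 4.0), "Ohio": (-0.5, 4.5), "Vermont": (30.0, 5.0)}
outcome = simulate_once(example_states, national_sd=2.0, rng=random.Random(538))
```

Seeding the generator makes a run repeatable, which helps when you are checking the model from one day to the next.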

Step Nine: Repeat step eight 100,000 times, and aggregate. 

That's the first simulation. Now, do the same thing 100,000 times. In each simulation, you'll find out which party won which states.

You can then use this to figure out the electoral vote count in each simulation. Once you have 100,000 simulated electoral vote counts, average them. This is your FiveThirtyEight projection. 

For each state, you can figure out the average margin of victory. This will show you the odds for each state. You can also program a row to figure out which state was the first to push someone over 270, which will allow you to calculate the Tipping Point states. 
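
The whole loop, including the tipping-point bookkeeping, might look something like the Python sketch below. It runs on a toy three-state map with invented margins, errors and electoral votes. It also assumes zero-mean normal errors and reads a state's odds as the share of simulations won, which is one common way to turn simulations into probabilities.

```python
import random
from collections import Counter

def tipping_point(margins, electoral_votes):
    """State that carries the winner past half of all electoral votes, or None on a tie."""
    total = sum(electoral_votes.values())
    needed = total // 2 + 1
    dem_ev = sum(ev for s, ev in electoral_votes.items() if margins[s] > 0)
    if dem_ev >= needed:
        sign = 1
    elif total - dem_ev >= needed:
        sign = -1
    else:
        return None
    running = 0
    # Walk from the winner's safest state toward the closest one.
    for state in sorted(margins, key=lambda s: sign * margins[s], reverse=True):
        running += electoral_votes[state]
        if running >= needed:
            return state

def run_simulations(projected, local_sd, electoral_votes, national_sd, n_sims, seed=0):
    """Return (mean Democratic electoral votes, win share per state, tipping-point counts)."""
    rng = random.Random(seed)
    wins, tips, ev_sum = Counter(), Counter(), 0
    for _ in range(n_sims):
        shift = rng.gauss(0.0, national_sd)  # shared national error
        margins = {s: m + shift + rng.gauss(0.0, local_sd[s]) for s, m in projected.items()}
        for s, m in margins.items():
            if m > 0:
                wins[s] += 1
                ev_sum += electoral_votes[s]
        tips[tipping_point(margins, electoral_votes)] += 1
    win_prob = {s: wins[s] / n_sims for s in projected}
    return ev_sum / n_sims, win_prob, tips

# Toy three-state map with made-up numbers; positive margins favor the Democrat.
projected = {"A": 8.0, "B": 1.0, "C": -6.0}
local_sd = {"A": 3.0, "B": 3.0, "C": 3.0}
votes = {"A": 10, "B": 7, "C": 12}
mean_ev, win_prob, tips = run_simulations(projected, local_sd, votes,
                                          national_sd=2.0, n_sims=2000)
```

Passing n_sims=100_000 matches the count used in this walkthrough; the smaller number here just keeps the example quick.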

Now, the key thing to becoming Nate Silver is this: Make sure to do this every day for years on end and add insightful daily commentary responding to different changes in the model that you notice after coding for hours on end. Once you manage that, you can contact Mr. Sulzberger and ask when to pick up your paycheck. 
