2017 is mercifully drawing to a close for the Pittsburgh Pirates, and a look ahead to 2018 shows several scenarios taking shape.

With the release of Juan Nicasio, the Pittsburgh Pirates organization has begun looking to the 2018 season as its next chance to compete, so this seems as good a time as any to begin speculating about and projecting how the 2018 team may do.

To do this we first need to make a statistical model of what makes a team win and lose.

The obvious

The two most obvious factors contributing to wins and losses are Runs Scored and Runs Against. In fact, a model of wins as a function of runs scored and runs against has a 93% correlation with actual wins, making it a really good method of projecting a team’s record.

The problem is that it is very hard to predict these two numbers on their own, so instead we have to break both runs scored and runs against down into their constituent parts. In other words, we have to find reliable methods for estimating the number of runs scored and runs against, in order to project wins.

Let’s start off with runs scored.

In the sabermetric era, the number of offensive statistics has exploded from a few, like batting average and RBI, to countless bizarre acronyms, all attempting to best describe offense. Many people point to statistics like wRC+ or wOBA as the best measures of offense. However, when we look at these statistics’ correlation to runs, it is clear that they are inferior at predicting runs relative to OPS or, more accurately, OPS’s constituent parts, OBP and SLG.

Here are the correlation coefficients for each statistic:

Statistic R
wRC+ 78.6%
wOBA 92.9%
OPS 93.9%
OBP and SLG 94.1%

Since we want the best predictor of runs, we choose the statistic(s) with the greatest correlation to runs scored, which would be a combination of both OBP and SLG. Keep these two in mind; we’ll be using them in a little bit.
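For readers who want to try this kind of comparison themselves, here is a minimal sketch of how a single-statistic correlation coefficient (Pearson's r) could be computed. It assumes two equal-length lists with one entry per team season, one holding a statistic and one holding runs scored. The function name pearson_r and the list layout are assumptions for illustration, not the code behind the table above, and the combined "OBP and SLG" row would need a two-variable regression rather than this single-variable formula.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations, and of squared deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / math.sqrt(var_x * var_y)
```

Calling pearson_r on each candidate statistic in turn (for example a hypothetical team_ops list against a team_runs list) and keeping the largest value mirrors the selection step described above.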

Now for the defensive side of the ball.

Defense is a bit trickier than offense to model, because there are two fundamentally different aspects to defense: pitching and fielding. In order to properly model runs against, we have to use one statistic that attempts to isolate a player’s fielding ability and one that attempts to isolate a pitcher’s pitching ability.

In my research for this article, I ran a number of different models using statistics like ERA, FIP and xFIP for pitching, as well as DRS, UZR, and UZR/150 for fielding. The best predictor of runs against that I could come up with was a model using FIP for pitching and UZR for fielding.

Since we want to adjust UZR for playing time, we will just divide UZR by plate appearances.

This model of FIP and UZR/PA correlates to runs against with an acceptable R=89.8%.

Now we can build a model of wins using our four selected statistics: OBP, SLG, FIP, and UZR/PA.

Here’s the function we get from that regression: Wins = 30.61 + 176.95*OBP + 176.08*SLG – 19.34*FIP + 596.80*(UZR/PA)
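As a minimal sketch, the regression equation can be written as a small Python function. The coefficients are copied from the formula above; the function and constant names are assumptions chosen for illustration, and inputs are expected as decimals (a .340 OBP is passed as 0.340).

```python
# Coefficients copied from the article's regression equation.
INTERCEPT = 30.61
COEF_OBP = 176.95
COEF_SLG = 176.08
COEF_FIP = -19.34          # a higher FIP costs wins, hence the negative sign
COEF_UZR_PER_PA = 596.80


def projected_wins(obp, slg, fip, uzr_per_pa):
    """Projected team wins from team OBP, SLG, FIP, and UZR per plate appearance."""
    return (
        INTERCEPT
        + COEF_OBP * obp
        + COEF_SLG * slg
        + COEF_FIP * fip
        + COEF_UZR_PER_PA * uzr_per_pa
    )
```

Because the OBP and SLG coefficients are nearly identical (176.95 and 176.08), the formula values a point of on-base percentage and a point of slugging almost equally, which makes it behave much like a model built on OPS.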

Now all we need is to plug in what we expect the Pittsburgh Pirates team numbers will be for 2018, and we will get a pretty good projection of how many wins we expect the Pirates to have next year.

This is much easier said than done.

We won’t know what kind of signings, trades, or call-ups the Pirates will make for next year until next year actually happens. Additionally, projecting things like injuries or other negative consequences is not something anyone can do with very much precision. So this next part of the article will be much less scientific, and will instead be a series of possible scenarios.

Scenario 1: The Absolute Best Case (and Probably Least Likely)

This is a scenario where:

  • Andrew McCutchen plays like the McCutchen of old, putting up 2014- or 2015-like numbers
  • Starling Marte puts up career average numbers
  • Gregory Polanco puts together a healthy campaign
  • Jung Ho Kang is allowed into the country, or the Pittsburgh Pirates get a 3rd baseman who produces like him
  • The pitching staff puts together something like the 2015 team’s performance, where the starting and relief pitching is dominant

With some rough estimations based on previous service time (in plate appearances) and career numbers, this team would have a .340 OBP, .420 SLG, .00154 UZR/PA, and an FIP of 3.50.

Plugging this into our formula gives us: Wins = 30.61 + 176.95*.340 + 176.08*.420 – 19.34*3.50 + 596.80*.00154 = 97.96, which rounds up to 98 wins.
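To see where those wins come from, the arithmetic can be broken out term by term. This is only a sketch that re-does the calculation with the Scenario 1 inputs and the regression coefficients quoted in this article; the dictionary layout is an assumption for illustration.

```python
# Regression coefficients and Scenario 1 inputs, both taken from the article.
terms = {
    "intercept": 30.61,
    "OBP": 176.95 * 0.340,
    "SLG": 176.08 * 0.420,
    "FIP": -19.34 * 3.50,
    "UZR/PA": 596.80 * 0.00154,
}

total = sum(terms.values())

# Print each term's contribution in wins, then the projected total.
for name, value in terms.items():
    print(f"{name:>9}: {value:7.2f}")
print(f"{'total':>9}: {total:7.2f}")
```

The breakdown shows that hitting and pitching carry almost the entire projection: OBP and SLG add roughly 60 and 74 wins to the intercept, FIP takes away about 68, and the fielding term contributes less than one win.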

This is obviously a very hopeful prediction and would put the Pirates as favorites to win the NL Central.

Scenario 2: A More Likely Prediction

This is a scenario where:

  • McCutchen puts up 2017-like numbers
  • Marte and Polanco produce healthy, career average seasons
  • Kang cannot play and the Pirates find an average to slightly above average 3rd Baseman
  • Cole and Taillon put together All-Star-like years, Nova puts up 2017 numbers again, and the 4th and 5th starters continue to be solid
  • The bull pen improves over this year, but doesn’t get to the 2013-15 level of dominance

This scenario gives us an OBP of .329, a SLG of .407, a UZR/PA that actually improves to .00218, and an estimated team FIP of 3.68 (roughly .330, .410, and 3.70 after rounding). That gives us the equation: Wins = 30.61 + 176.95*.329 + 176.08*.407 – 19.34*3.68 + 596.80*.00218 = 90.62, which rounds up to 91 wins.

If the Pittsburgh Pirates win 91 games next year, they would have a 96% chance of making the playoffs. With a little bit of luck, a team projected at 91 wins could easily win 95 or so and take the division. This scenario seems more likely, as the only two big changes to this 2017 team are Cole returning to form and Taillon really living up to expectations after his first full Major League year. Everyone else is just playing to their career averages.

Scenario 3: Without Cutch

This is the same as Scenario 2, except that McCutchen is traded and his replacement is a league average outfielder. I also assumed some additional PAs from Polanco, Marte, Frazier, and Rodriguez as a result.

This gives us a .330 OBP, .400 SLG, .00276 UZR/PA, and the same FIP of 3.70, and about 90 wins.

Ninety wins means about a 92% chance of the playoffs. Interestingly, dropping McCutchen from the roster only translates into an estimated 1 fewer win. This doesn’t take into account any morale loss from losing a clubhouse leader and longtime fan favorite, however.
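A rough way to see why the gap is so small is to multiply each regression coefficient by the change in its input. The sketch below assumes the Scenario 2 inputs are the ones used in its equation (.329, .407, 3.68, .00218) and the Scenario 3 inputs are as printed above; the names win_deltas, scenario_2, and scenario_3 are made up for illustration.

```python
# Regression coefficients from the article (the intercept cancels in a difference).
COEFS = {"obp": 176.95, "slg": 176.08, "fip": -19.34, "uzr_per_pa": 596.80}

# Assumed inputs: Scenario 2 as used in its equation, Scenario 3 as printed.
scenario_2 = {"obp": 0.329, "slg": 0.407, "fip": 3.68, "uzr_per_pa": 0.00218}
scenario_3 = {"obp": 0.330, "slg": 0.400, "fip": 3.70, "uzr_per_pa": 0.00276}


def win_deltas(before, after):
    """Change in projected wins attributable to each input."""
    return {stat: COEFS[stat] * (after[stat] - before[stat]) for stat in COEFS}


deltas = win_deltas(scenario_2, scenario_3)
for stat, delta in deltas.items():
    print(f"{stat:>10}: {delta:+.2f}")
print(f"{'net':>10}: {sum(deltas.values()):+.2f}")
```

Under these assumptions the lost slugging costs a little over one win, the improved fielding gives back about a third of a win, and the small OBP and FIP differences appear to come from rounding in the printed inputs rather than from the trade itself.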

Scenario 4: Without Cutch II

This is a less optimistic version of Scenario 3:

  • McCutchen’s replacement is slightly worse than league average
  • Cole and Taillon have good but not All-Star level years

Plugging a .320 OBP, .400 SLG, 3.80 FIP, and .00260 UZR/PA into our formula gives us about 86 wins.

An 86 win season gives the Pirates a 25% chance of making the playoffs, but depending on the year, it could be better than that. Additionally, an 86 win team would be a very welcome improvement over this year’s club.

This is the most likely scenario, in my opinion, because it doesn’t require any players to play outside of their skill set. All that is necessary for this team to be a winning team again is for Cole, Polanco, and Marte to play to their career numbers and Taillon to continue to improve as a Major Leaguer. Using this scenario as the base, additions to this team during the offseason could push that win total up to 88-90 relatively quickly.

Scenario 5: Murphy’s Law

This is the exact opposite of Scenario 1, and is equally unlikely. This absolute worst case scenario is one where anything and everything that could go wrong does.

  • McCutchen is traded and his replacement is significantly worse than average
  • Polanco has another injury-plagued year
  • Marte repeats his 2017 numbers over a full season
  • The 3rd baseman they get blows out an ACL and misses half the season
  • Felipe Rivero has Tommy John surgery and the rest of the bullpen falls apart
  • None of the starters have a sub-4.00 FIP
  • None of the replacements that the Pirates utilize to fill these holes are even halfway decent

While this scenario is as unlikely as the first, we can still put numbers to it just to see what a worst case scenario looks like.

The numbers look like this: .320 OBP, .390 SLG, .00232 UZR/PA, and 4.50 FIP. Plugging these in gives us about 70 wins.

Even in this theoretical, absolute worst case scenario, the Pittsburgh Pirates wouldn’t be a bottom 5 team in the league most seasons.
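For anyone who wants to check the arithmetic or try their own assumptions, here is a short sketch that runs all five scenarios through the regression formula. The coefficients and inputs are the ones quoted in this article (Scenario 2 uses the values from its equation); the function name and the layout of the scenarios dictionary are assumptions for illustration.

```python
def projected_wins(obp, slg, fip, uzr_per_pa):
    """Regression formula from the article, with its published coefficients."""
    return 30.61 + 176.95 * obp + 176.08 * slg - 19.34 * fip + 596.80 * uzr_per_pa


# (OBP, SLG, FIP, UZR/PA) for each scenario, as given in the article.
scenarios = {
    "1: Best case": (0.340, 0.420, 3.50, 0.00154),
    "2: More likely": (0.329, 0.407, 3.68, 0.00218),
    "3: Without Cutch": (0.330, 0.400, 3.70, 0.00276),
    "4: Without Cutch II": (0.320, 0.400, 3.80, 0.00260),
    "5: Murphy's Law": (0.320, 0.390, 4.50, 0.00232),
}

results = {name: projected_wins(*inputs) for name, inputs in scenarios.items()}

# Show the raw projection next to the rounded win total for each scenario.
for name, wins in results.items():
    print(f"Scenario {name:<20} {wins:6.2f} -> {round(wins)} wins")
```

Swapping in different inputs for any scenario is a quick way to see how sensitive the projection is to a single assumption, such as the team FIP.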

Not gospel

These projections are largely just for fun. As I said earlier, it is extraordinarily difficult to predict all of the variables that will come into play between the end of October and Spring Training, let alone the kinds of injuries and other setbacks that will befall this, and every other, team in season. So this, and every other projection, should be taken with a grain of salt.

That being said, we can learn some things from these projections.

Retrospectively, these projections make the 2017 season all the more disappointing. Scenario 4 really only requires the current Pittsburgh Pirates roster, less McCutchen, to play to their career averages. Do that, and all of a sudden this 75-ish win team jumps to an 85-ish win team, signifying that the collective Pirates roster performed well under its potential this year.

As disappointing as this may be, however, it does bode well for the 2018 Pirates. Over large sample sizes, like, say, two seasons of baseball, individuals tend to “regress to their average”. Fortunately for the Pirates, the 2018 team is due for positive regression in its win column.

You can also see with these projections why the front office was so hesitant to sell off key pieces at the trade deadline. Moving key players like Harrison or Cole at the deadline would have had drastic negative consequences on the ’18 Pirates’ likelihood of competing, a likelihood that is actually quite good.

With some smart moves this offseason, like picking up a decent 3rd baseman, and finding one more outfielder, either externally or from the system, and some bullpen bolstering, the Pittsburgh Pirates have a legitimate shot at contending in 2018.

Nate Werner

Nate Werner is a senior at Penn State, where he is studying for his B.S. in Economics. He is a lifelong Pirates fan who uses the tools of statistical analysis to dive deeper into the numbers of baseball. His goal is to take the style of analysis used in front offices across the Major Leagues and bring it to the computer screens of everyday fans. You can read some of Nate’s more general analyses of baseball on goldboxstats.wordpress.com and follow him on Twitter @GoldBoxStats.

  • Bobby Ewing

    Very informative

  • JPksu

    Outside of the final scenario, these are ALL overly-optimistic given the current run environment. There are only 2 teams with a FIP below 4 this season. Your assumption of 3.80 FIP would’ve been top 5 even in 2016 and you present that as a likely scenario. Also, in this run environment an OPS of .730 or less will not score enough runs to generate the run differential needed to win. I’m not sure what UZR you’re starting with because the Pirates are almost always negative yet your UZR/PA is a positive number.

    An OPTIMISTIC scenario given this lineup construction would be an OPS of .750 and a FIP of 4.0. Even that only generates 85 wins which ain’t enough to reach the wild card most seasons. Hoping for a sub-4.0 FIP isn’t a good strategy. This team needs more offense. If they can get a 4.0 FIP and figure out a way to generate a .770+ OPS then they can reach the wildcard. I have no idea how they are going to find .020+ additional OPS but I guess that’s why NH got the extension…

    • JPksu

      Including projected seasons for Marte, Polanco and Kang still only made this ~82 win team which is about what I recall from the 2017 pre-season projections. So even before the setbacks this season, this team wasn’t likely to compete for a post-season spot

    • Nate Werner

      This is a very astute observation; however, it’s a misuse of the formula. I didn’t explain this in the article, but since the data used was team seasons from 2010-2016, we have to judge the numbers used against the league averages over that period when we talk about above or below average hitting or pitching. So, for instance, an FIP of 3.80 is 0.17 better than league average from 2010-16 and is within 1 standard deviation. The 2017 equivalent would be about a 4.00, but we adjust it to the averages of the time. Unfortunately, just using the ’15 and ’16 seasons would be too small of a sample size to yield statistically significant results, so additional seasons must be included.

      In regards to OBP and SLG, most of that boost comes from assuming Marte and Polanco will return with healthy, productive years. For instance, Marte’s career OPS is ~.800 and this year it is .670; Polanco’s is in the upper .700s and currently in the low .700s. Having just those two hitting at career numbers over the course of a full season boosts this team’s OBP and SLG numbers most of the way. Adding league average hitters in both OF and 3B also improves the numbers, as does some minor continued development of Josh Bell.

      Thanks for the feedback,
      Nate