Participation in NCAA Baseball and Fastpitch Softball

Having just stumbled upon the NCAA’s reporting on participation in college athletics, I thought I’d take a look at how many athletes participate in NCAA baseball and fastpitch softball for all of Division I, Division II, and Division III.

On the positive side, women’s softball has seen participation increase every year since reporting began in the 2007-08 season. However, in 2015-16 that growth slowed to just 0.26 percent, the slowest rate of growth in the reporting period. Softball added just 51 players for the 2015-16 season.

In contrast, after a period of stagnation, men’s baseball has seen considerable growth in participation of late. Over the past four seasons (2011-12 through 2015-16), participation in men’s baseball increased by 3,356 players. Over the same period, participation in women’s softball increased by 1,174 players.

participation

Year Baseball Softball
2007-08 30,388 17,154
2008-09 29,816 17,489
2009-10 30,365 17,726
2010-11 31,264 18,188
2011-12 31,199 18,505
2012-13 32,450 18,671
2013-14 33,433 19,047
2014-15 34,198 19,628
2015-16 34,555 19,679

Source: http://web1.ncaa.org/rgdSearch/exec/saSearch
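
For anyone who wants to check the growth figures, here is a minimal Python sketch that recomputes them from the table above. The variable and function names are my own, and the counts are simply copied from the table.

```python
# Participation counts copied from the table above (NCAA, all divisions).
seasons = ["2007-08", "2008-09", "2009-10", "2010-11", "2011-12",
           "2012-13", "2013-14", "2014-15", "2015-16"]
baseball = [30388, 29816, 30365, 31264, 31199, 32450, 33433, 34198, 34555]
softball = [17154, 17489, 17726, 18188, 18505, 18671, 19047, 19628, 19679]


def yearly_growth(counts):
    """Return (absolute change, percent change) for each season after the first."""
    return [(cur - prev, 100.0 * (cur - prev) / prev)
            for prev, cur in zip(counts, counts[1:])]


softball_growth = yearly_growth(softball)
for season, (added, pct) in zip(seasons[1:], softball_growth):
    print(f"{season}: {added:+d} players ({pct:+.2f}%)")

# Change over the past four seasons (2011-12 through 2015-16).
baseball_four = baseball[-1] - baseball[-5]
softball_four = softball[-1] - softball[-5]
print(baseball_four, softball_four)
```

The last softball entry is the 51-player, 0.26 percent season, and the two four-season totals match the 3,356 and 1,174 quoted above.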

Acquiring NCAA Softball Stats with R

A nice instructional piece by Bill Petti on how to acquire NCAA baseball data was just posted to The Hardball Times website. I’ve been asked many times whether the NCAA provides a database of softball data, and my response has always been “unfortunately, no.” Bill’s instructions can be modified to acquire NCAA softball data as well.

Here’s the link to the article: Research Notebook: Acquiring NCAA Baseball Stats with R.

For other data collection tools for use with women’s softball, see new-softball-research-tools-released-ncaa-softball.

If you don’t know how to use R and/or want to learn more about sabermetrics, there is a course that combines the two titled Sabermetrics 101.

Revisited: Do power-heavy or speed-heavy teams do better in the postseason?

By Matt Meuchel

I have revisited the concept that I investigated in a previous article about power-heavy teams and speed-heavy teams. Last time I examined it, it was with 2 years of data (2013 and 2014). Now that I have 4 seasons of data I wanted to check back in on it. This time I added the 2 other parts of the equation that I didn’t check.

So here is a refresher on the methodology. I used 2 statistics to determine whether a team was an All Power Team or an All Speed Team: Isolated Power (ISO) and Stolen Bases per Game (SB/G). For each of the 4 seasons I identified all teams that were 1 and 2 standard deviations above and below the mean in both statistics. From this I could find the teams that fit into 4 different categories:
1) Teams with Above Average Power and Above Average Speed
2) Teams with Below Average Power and Below Average Speed
3) Teams with Above Average Power and Below Average Speed (Termed “All Power Teams”)
4) Teams with Below Average Power and Above Average Speed (Termed “All Speed Teams”)
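
The four categories above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the exact procedure used for the article: the cutoff of 1 standard deviation, the use of the population standard deviation, the function name, and the five made-up teams are all assumptions. ISO is taken in its usual sense of slugging percentage minus batting average.

```python
from statistics import mean, pstdev


def classify_teams(teams, threshold_sd=1.0):
    """Label each team by how its ISO and SB/G compare with the season mean.

    teams: dict mapping team name -> (iso, sb_per_game).
    threshold_sd: how many standard deviations from the mean count as
        "above" or "below" average (an assumption of this sketch).
    Teams that do not clear the threshold in both statistics get no label.
    """
    iso_values = [iso for iso, _ in teams.values()]
    sb_values = [sb for _, sb in teams.values()]
    iso_mean, iso_sd = mean(iso_values), pstdev(iso_values)
    sb_mean, sb_sd = mean(sb_values), pstdev(sb_values)

    labels = {}
    for name, (iso, sb) in teams.items():
        power = (iso - iso_mean) / iso_sd   # z-score for power
        speed = (sb - sb_mean) / sb_sd      # z-score for speed
        if power >= threshold_sd and speed >= threshold_sd:
            labels[name] = "1: above average power and speed"
        elif power <= -threshold_sd and speed <= -threshold_sd:
            labels[name] = "2: below average power and speed"
        elif power >= threshold_sd and speed <= -threshold_sd:
            labels[name] = "3: All Power Team"
        elif power <= -threshold_sd and speed >= threshold_sd:
            labels[name] = "4: All Speed Team"
    return labels


# Hypothetical teams, invented purely to exercise the function.
toy_teams = {
    "Team A": (0.250, 2.0),
    "Team B": (0.050, 0.2),
    "Team C": (0.250, 0.2),
    "Team D": (0.050, 2.0),
    "Team E": (0.150, 1.1),
}
toy_labels = classify_teams(toy_teams)
for team, label in toy_labels.items():
    print(team, "->", label)
```

Team E sits at the mean in both statistics, so it falls into none of the four categories, which is what happens to most real teams under this kind of cutoff.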

Here are the statistics associated with these categories:

1) There were 31 teams in 4 seasons that qualified as Teams with Above Average Power and Above Average Speed. Of those 31 teams 11 (35.5%) did not make the post season and 20 (64.5%) made the post season. Of those that made the post season 11 were Regional Teams (35.5% of total), 3 were Super Regional Teams (9.7% of total), and 6 were WCWS Teams (19.4% of total).

2) There were 40 teams in 4 seasons that qualified as Teams with Below Average Power and Below Average Speed. Of those 40 teams 40 (100%) did not make the post season and 0 (0%) made the post season.

3) There were 22 teams in 4 seasons that qualified as “All Power Teams”. Of those 22 teams 14 (63.6%) did not make the post season and 8 (36.4%) did make the post season. Of those that made the post season 6 were Regional Teams (27.3% of total), 1 was a Super Regional Team (4.5% of total), and 1 was a WCWS Team (4.5% of total).

4) There were 30 teams in 4 seasons that qualified as “All Speed Teams”. Of those 30 teams 27 (90%) did not make the post season and 3 (10%) did make the post season. Of those that made the post season all 3 of them were Regional Teams (10% of total) and none were Super Regional or WCWS teams.

In conclusion, of course teams want to be in Category 1, with above average power and speed, and no one wants to be in Category 2, with below average power and speed. Even though these are very intuitive thoughts, I wanted to put the stats to these two categories. Categories 3 and 4 were the ones I examined before and wanted to update. Both have become less favorable toward post season play with 2 more seasons of data. Category 4 was not favorable after 2 seasons and is even less so after 4. The funny thing about Category 4 is how stable it is, both in the number of teams that qualify for it (7, 8, 7, and 8 teams in the 4 individual years) and in the number that make the post season (1 team in each of 2013, 2014, and 2015, but none in 2016). Interestingly, Category 3 had more favorable post season stats after 2 seasons (when 50% qualified for the post season) than after 4 seasons (36.4%). Even with that decline, Category 3 sends over 3 times as large a share of its members to the post season as Category 4.
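
The post season rates for the four categories can be recomputed from the counts given above. A minimal Python sketch follows; the dictionary layout and names are my own, and the counts are the ones reported in this article.

```python
# (teams in category, teams that reached the post season), from the counts above.
categories = {
    "1: above average power and speed": (31, 20),
    "2: below average power and speed": (40, 0),
    "3: All Power Teams": (22, 8),
    "4: All Speed Teams": (30, 3),
}

rates = {name: made / total for name, (total, made) in categories.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1%} reached the post season")

# How much more often All Power Teams qualify than All Speed Teams.
power_vs_speed = rates["3: All Power Teams"] / rates["4: All Speed Teams"]
print(f"All Power vs. All Speed: {power_vs_speed:.2f}x")
```

The ratio compares rates rather than raw team counts, since the two categories are different sizes (22 teams versus 30).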

Christopher Long’s D-I Top 20

Let’s start out by saying that the number-one team in D-I softball for 2016 was not who you think it was.

Following up on the D-III top 20 and the D-II top 20 as provided by Detroit Tigers analyst Christopher Long, here are his season-ending top 20 teams in D-I softball. Long’s rankings take into account the impact of a team’s home field, their offensive and defensive strength, and their strength of schedule. And yes, there’s a surprise at #1.

Rank     School        Overall Strength  Home Park  Offensive Strength  Defensive Strength  SOS
1        Florida       7.887             0.956      2.198               0.279               1.478
2        Oklahoma      6.036             1.061      2.297               0.381               1.410
2 (tie)  Oregon        6.036             0.973      2.579               0.427               1.397
4        Michigan      6.029             0.993      2.465               0.409               1.352
5        Auburn        5.857             1.018      2.618               0.447               1.434
6        Alabama       5.159             1.000      2.218               0.43                1.468
7        Florida St.   5.029             0.984      2.25                0.447               1.384
8        Washington    4.934             0.971      2.719               0.551               1.515
9        UL Lafayette  4.706             0.998      2.246               0.477               1.241
10       Georgia       4.566             1.012      2.116               0.463               1.417
11       Missouri      4.449             0.968      2.391               0.537               1.447
12       LSU           4.336             0.907      2.060               0.475               1.478
13       Tennessee     4.321             0.993      2.231               0.516               1.406
14       Texas A&M     3.938             0.947      2.530               0.642               1.474
15       UCLA          3.861             0.992      2.289               0.593               1.531
16       Minnesota     3.754             1.021      2.017               0.537               1.294
17       James Mad.    3.752             1.054      1.485               0.396               1.149
18       Kentucky      3.648             1.051      1.531               0.420               1.351
19       Arizona       3.420             1.188      1.672               0.489               1.457
20       Utah          3.391             1.205      1.596               0.471               1.438

Yes, of course Oklahoma won the national championship. But for their body of work over the course of the season, according to Long’s calculations Florida was clearly the best team in the country.

The teams from the Women’s College World Series were well represented in the rankings, with the final 8 teams all ranked in the top 15.

And not to beat a dead horse, but what happened to #2 Oregon? Their ouster at home by UCLA in the Super Regionals is even more surprising seeing these rankings.

Christopher Long’s D-II Top 20

Following up on the D-III top 20 as provided by Detroit Tigers analyst Christopher Long, here are his season-ending top 20 teams in D-II softball. Long’s rankings take into account the impact of a team’s home field, their offensive and defensive strength, and their strength of schedule.

Rank  School        Overall Strength  Home Park  Offensive Strength  Defensive Strength  SOS
1     N. Alabama    2.761             0.980      2.135               0.432               1.005
2     N. Georgia    2.579             0.933      1.787               0.387               0.971
3     Saint Leo     2.412             1.003      1.516               0.352               0.938
4     Humbldt St.   2.340             1.145      1.700               0.406               0.967
5     W. Tx. A&M    2.156             1.152      1.938               0.503               0.904
6     Valdosta St.  2.088             0.997      1.856               0.497               0.982
7     S. Arkansas   2.065             0.943      1.677               0.454               0.938
8     Armstrng St   1.870             0.967      1.641               0.491               0.999
9     U. Indy       1.820             0.983      1.580               0.486               0.83
10    Mo.-St. Lou.  1.807             0.977      1.322               0.409               0.836
11    Ark. Tech     1.806             0.905      1.509               0.467               0.935
12    WV Wesl.      1.682             1.010      1.378               0.458               0.81
13    Ala.-Hunts.   1.611             1.016      1.689               0.587               0.993
14    W. Florida    1.54              0.956      1.597               0.580               0.986
15    Cal. Baptist  1.507             0.97       1.438               0.533               0.871
16    Rollins       1.505             0.951      1.378               0.512               0.904
17    Wayne St.     1.496             1.045      1.27                0.475               0.793
18    Azusa Pac.    1.457             1.022      1.556               0.597               0.898
19    Chico St.     1.444             0.960      1.452               0.562               0.937
20    Georgia C.    1.439             0.950      1.724               0.670               0.901

Christopher Long’s D-III Top 20

Christopher Long, currently of the Detroit Tigers, has the ability to analyze almost any sport. This includes softball. Best of all he shares the code he uses and his findings at his GitHub site https://github.com/octonion. This is the same Christopher Long who makes an appearance in the book The Only Rule Is It Has To Work, which is a great read on the benefits and challenges of applying analytics to minor-league baseball. But I digress.

I just noticed that Christopher has updated his year-end softball rankings, which take into account the impact of a team’s home field, their offensive and defensive strength, and their strength of schedule.

Here are his rankings for D-III.

Rank  School        Overall Strength  Home Park  Offensive Strength  Defensive Strength  SOS
1     Texas-Tyler   2.239             0.987      2.285               0.245               0.801
2     CMS           1.353             0.916      1.830               0.325               0.776
3     Salisbury     1.312             0.96       2.122               0.389               0.664
4     Berry         1.256             0.992      2.18                0.417               0.735
5     E. Tx. Bapt.  1.244             0.903      2.411               0.466               0.791
6     Va. Wslyn.    1.241             0.969      1.82                0.353               0.714
7     Emory         1.215             0.995      2.26                0.447               0.75
8     Rowan         1.201             1.011      2.159               0.432               0.708
9     Texas Lu.     1.192             0.926      2.167               0.437               0.667
10    Linfield      1.171             0.959      2.204               0.453               0.856
11    Luther        1.139             1.03       1.71                0.361               0.653
12    Birm.-So.     1.063             0.919      1.979               0.448               0.753
13    St. Thomas    1.047             0.894      1.693               0.389               0.705
14    Trine         0.951             1.065      1.813               0.459               0.747
15    Messiah       0.930             1.016      1.847               0.477               0.689
16    Pacific (OR)  0.924             0.96       2.001               0.521               0.856
17    Chris. Newp.  0.923             1.034      2.029               0.528               0.734
18    Whitworth     0.904             0.911      1.763               0.469               0.853
19    George Fox    0.879             1.032      1.708               0.467               0.841
20    La Verne      0.873             0.991      1.754               0.483               0.735

Since I am an assistant coach at Claremont-Mudd-Scripps Colleges, of course it is nice to see that we’re ranked #2. But what I really can’t help but notice is the strength of the West Region. A remarkable 9 of the top 20 teams play in the West Region, and just 7 of them made the playoffs. Left out of the mix were #16 Pacific (25-16-1 overall) and #20 La Verne (31-11). Because the NCAA’s priority for D-III softball is saving money by flying as few teams as possible to Regional locations, not only were two teams left out of the post-season, but the remaining 7 teams were fighting for just one spot in the eight-team World Series. Not surprisingly, that one team from the West Region, Texas-Tyler, won the national championship.

To learn more about Christopher Long and his work, follow him on Twitter at @octonion or see his blog at http://angrystatistician.blogspot.com.

Infographic: 2016 WCWS

WCWS 2016

It turns out that a comment by Adelphi associate head coach Ophir Sadeh is right: women’s college softball is a perfect fit for television. This year’s D-I Women’s College World Series featured competitive games played within a reasonable amount of time. In fact, 6 of the 15 WCWS games were decided by just 1 run and 9 games were decided by 2 runs or fewer. There were a record 78,072 fans in attendance in Oklahoma City, and the title game between Oklahoma and Auburn easily topped the cable sports TV ratings for June 8.

Regarding Strikeouts, Softball is Becoming a Contact Sport

A trend of fewer strikeouts in D-I women’s softball continued for a sixth season in 2016. Strikeouts dropped to the lowest point since 2001, recorded at 4.59 strikeouts per 7 innings pitched.

Here are strikeouts per 7 innings pitched since 1982.

Strikeouts Scoring D-I

Strikeouts bottomed out in the heart of the small-ball era, reaching their low point in 1987 at just 2.7 strikeouts per 7 innings pitched. Strikeouts peaked in 2010 at 5.48 per 7 innings pitched.

Strikeout data was provided by Nevada head coach Matt Meuchel, who also looked deeper into strikeouts over the past 4 seasons in D-I softball. Matt found that strikeouts per plate appearance account for 24.84% of the variance in runs per game for all of the teams in D-I softball. This likely means that while strikeouts have an effect on scoring, the relationship is not as strong as, say, the one between home runs and scoring.

Strikeout PA Scoring D-I

Matt also took a look at the relationship between strikeouts looking and runs per game over the past 4 seasons.

Strikeout Looking PA Scoring D-I

As shown above strikeouts looking accounted for 11.02% of the variance in runs per game. The weakness of the relationship can also be seen by how far each team’s point strays from the trendline.
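
To put the two "percent of variance" figures on the more familiar correlation scale, here is a small Python sketch. It assumes each figure is the R² of a one-variable linear trendline, in which case the size of the correlation coefficient is simply the square root of R². The sign of the correlation cannot be recovered from R² alone, so only the magnitude is shown.

```python
import math

# Share of variance in runs per game explained, as reported above.
r_squared = {
    "strikeouts per PA": 0.2484,
    "strikeouts looking per PA": 0.1102,
}

# |r| is the square root of R^2 for a one-predictor linear fit (assumed here).
abs_r = {name: math.sqrt(value) for name, value in r_squared.items()}
for name, value in abs_r.items():
    print(f"{name}: R^2 = {r_squared[name]:.4f}, |r| = {value:.3f}")
```

Both values land in the range usually described as weak to moderate, which fits the scatter around the trendlines.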

For me I would say Babe Ruth put it best when he said, “Never allow the fear of striking out keep you from playing the game”. In women’s softball the strikeout is becoming less of a fear all the time.

Softball’s Golden Age of Defense

It’s easy to think that back in the day, when we were younger, the game was somehow better. It doesn’t seem to matter what that game is but in our memory that game was somehow at its best in the past.

For women’s softball, many players and coaches fondly remember the small-ball era. From 1982-1992 and again from 2001-2004, D-I women’s softball was a low run-scoring environment (see the chart on scoring in D-I softball). Since 2004 scoring has jumped over 35%, making some long for the old days of pitching and defense.

But were those times really so golden? From a fielding-percentage perspective, it wouldn’t seem that they were.

Fielding D-I

Though fielding percentage may not be the ideal metric for measuring defense, it’s what we have available. And despite what we might think according to our memories, the numbers show that we now could be in the golden era of defense.

A thanks again to Nevada’s Matt Meuchel for making all of this data available.

Scoring in D-I Softball from 1982 to the Present

There are times when I wonder why I have this site, if it’s worth the trouble, and if anyone really reads it. Then there are other times, like today, when I receive a piece of research unexpectedly and it feels like Christmas. I guess that means Nevada’s head coach Matt Meuchel is Santa Claus.

Matt, who is known in softball circles as a numbers person, just sent me a package of softball research. I will try to release some of these stats on a daily basis. A big thank you to Matt for sharing!

Recently I wrote about some offensive trends in softball. Matt’s numbers go much farther than mine and thus provide a richer picture of trends in D-I softball. Here are runs per game, per team since 1982.

D-I Scoring

Scoring in D-I softball hit its low point in 1986 at just 3.02 runs per game, per team. Scoring peaked last year at 4.81, a remarkable increase of 59 percent.

Matt looked at the relationship between batting average and scoring and found a strong correlation. Batting average accounts for 84.46% of the variance in scoring over the past 35 seasons in D-I softball. The strength of this relationship is also shown by how closely each point below is located in relation to the trendline.

Batting Avg Scoring D-I

According to my rough estimate, every 10 points of batting average accounts for a change in scoring of about .25 runs per game. If I take a leap and assume that a team increases its batting average by 10 points, over the course of a 50 game season that team can expect to score 12.5 more runs. Plugging that number into the Pythagorean expectation formula (often called the Pythagorean Theorem of baseball), such an increase would mean about 1 more win for that team.
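
The win estimate above can be reproduced with a rough Python sketch of the Pythagorean formula, more formally known as the Pythagorean expectation. Several things here are assumptions rather than figures from the article: the classic baseball exponent of 2 (an exponent fitted to softball may differ), and a baseline team that scores and allows the 4.81 runs per game cited earlier as the D-I average.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and allowed.

    The default exponent of 2 is the classic baseball value and is an
    assumption here, not a value fitted to softball.
    """
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


GAMES = 50
RUNS_PER_GAME = 4.81          # assumed baseline: the D-I average cited above
EXTRA_RUNS = 0.25 * GAMES     # 12.5 runs from 10 points of batting average

baseline_runs = RUNS_PER_GAME * GAMES
before = GAMES * pythagorean_win_pct(baseline_runs, baseline_runs)
after = GAMES * pythagorean_win_pct(baseline_runs + EXTRA_RUNS, baseline_runs)
extra_wins = after - before
print(f"{before:.1f} -> {after:.1f} wins ({extra_wins:+.2f})")
```

Under these assumptions the gain works out to a little more than one win. A team that starts well above or below .500 would see a slightly different number.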

One element of batting average is home runs. Part of the problem with batting average is that it treats all hits (singles, doubles, triples, and home runs) equally. In reality we know that all hits aren’t created equal since a home run is much more valuable than a single. Matt also looked at whether scoring correlates with home runs.

HR Scoring D-I

With an R² (coefficient of determination) of .7292, scoring correlates quite well with home runs alone. Using the earlier example of a 50 game season, if your team were to hit around six more home runs each season I would expect you to win one more game.
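
For readers who want to compute a coefficient of determination on their own team data, here is a minimal Python sketch for a straight-line fit. The function name and the four data points are made up purely to show the mechanics. They are not Matt’s data and will not reproduce the .7292 figure.

```python
def r_squared(xs, ys):
    """Coefficient of determination for a least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # For a one-variable linear fit, R^2 equals the squared correlation.
    return (sxy * sxy) / (sxx * syy)


# Hypothetical values, invented for illustration only.
toy_hr_per_game = [0.2, 0.4, 0.6, 0.8]
toy_runs_per_game = [3.0, 3.6, 3.9, 4.7]
toy_r2 = r_squared(toy_hr_per_game, toy_runs_per_game)
print(f"R^2 = {toy_r2:.3f}")
```

An R² near 1 means the points hug the trendline, and a value near 0 means the trendline explains almost none of the spread.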

Matt also found that there is little relationship between stolen bases per game and the number of runs that are scored.

SB Scoring D-I

What is interesting to me is that even in years where the run-scoring environment was low, the number of bases that were stolen doesn’t appear to correlate to the number of runs scored. This seems to reinforce my previous research on stolen bases which showed that it’s not the number of bases that a team steals but how efficiently they do it that matters.

Thank you again Matt! Much more to come.