CPAN Report 2020

Wed 3 March 2021

CPAN was launched in 1995, so Perl developers have been sharing their code with each other for more than 25 years. In this post I'll share various charts that show how releases to CPAN have waxed and waned. I previously did a CPAN Report 2013, which, as you'll see below, was when many measures were at or near their peak.

New user signups

In 2020, 141 people signed up for a PAUSE account. The best year was 2012, when 850 people signed up.

PAUSE only started recording the signup date in 1999, which is why all accounts created from 1995 through 1999 are lumped together against 1999.

In 2012 the book Intermediate Perl was released; it has a chapter that tells people to sign up for a PAUSE account and to upload a test distribution. That explains the bump from 2011 to 2012. In 2014 there were a number of activities encouraging people to get more active on CPAN, including CPAN Day.

Without those things, the peak might have been 2009, with a steady decline since then.

The PAUSE admins are also a bit more cautious about handing out PAUSE accounts these days, as we're getting more people signing up for nefarious purposes.

First-time CPAN releasers

In 2020, 148 people did their first upload to PAUSE. This is seven more than signed up for an account, so clearly some people who'd signed up earlier waited a bit before releasing.

We know the date of every release right from the start, which is why this chart starts in 1995. We see the same decline since 2012, but has it plateaued, or is 2020 the lockdown effect, with more people deciding to release something than would have done otherwise?

When did people do their first release?

Prompted by a question I got after publishing this, I wondered how many people release in the same year they signed up, how many release some time later, and how many never do release.

Of the people who signed up in 2000 and 2001, fewer than 30% never released anything. For people who signed up in recent years, the figure is above 60%.
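The three-way split described above (released in the signup year, released later, never released) can be sketched roughly as follows. This is only an illustration: the `accounts` records are made up, and the `classify` and `never_released_pct` names are hypothetical, not taken from whatever scripts produced the chart.

```python
from collections import Counter, defaultdict

def classify(signup_year, first_release_year):
    """Return 'same year', 'later' or 'never' for one PAUSE account."""
    if first_release_year is None:
        return "never"
    # Assumption: accounts created before 1999 are recorded against 1999,
    # so a first release can predate the recorded signup year. Those are
    # counted as 'same year' here.
    if first_release_year <= signup_year:
        return "same year"
    return "later"

# Hypothetical records: (signup year, year of first release or None).
accounts = [
    (2000, 2000),
    (2000, 2003),
    (2000, None),
    (2019, None),
    (2019, None),
    (2019, 2019),
]

# Tally the three categories separately for each signup year.
by_year = defaultdict(Counter)
for signup_year, first_release_year in accounts:
    by_year[signup_year][classify(signup_year, first_release_year)] += 1

def never_released_pct(year):
    """Percentage of accounts created in `year` that never released."""
    counts = by_year[year]
    return 100 * counts["never"] / sum(counts.values())

for year in sorted(by_year):
    print(year, dict(by_year[year]), f"{never_released_pct(year):.0f}% never released")
```

With real data, the per-year percentages from `never_released_pct` would be the numbers to compare between the early 2000s and recent years.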

CPAN Releasers per year

How many of those first-time releasers go on to do more releases? This next chart shows the number of people who did at least one release each year:

This one smooths off the first peak at 2003 seen in the previous chart, so we can see steady growth to 2013, and steady decline since then. Again, 2020 appears to buck the trend. What will we see at the end of 2021?

Distributions released per year

This next chart shows the number of different distributions released per year.

The peak here was reached in 2014. That is almost certainly due to CPAN Day — without it I suspect 2013 would have been the peak.

This hasn't declined as steeply as the number of releasers, so those who are still releasing are each releasing slightly more distributions.

Minimum Perl Version

61% of CPAN distributions don't specify a minimum version of Perl required to run them. The following chart shows which are the most popular versions, where a version is specified:

Where distributions specified 5.10.0 or 5.10.1, etc, I lumped them together under 5.10, to keep things simple.
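The lumping described above might look something like this. It is a minimal sketch, not the code used for the chart: the `major_minor` helper is hypothetical, and it assumes versions arrive either in dotted form (`5.10.1`, `v5.10.1`) or in Perl's decimal form with three digits per component (`5.010001`).

```python
from collections import Counter

def major_minor(version: str) -> str:
    """Collapse a Perl version string to 'major.minor', e.g. '5.10.1' -> '5.10'."""
    raw = version.strip()
    v = raw.lstrip("v")
    parts = v.split(".")
    if raw.startswith("v") or len(parts) >= 3:
        # Dotted form: 5.10.1 or v5.10.1
        minor = parts[1] if len(parts) > 1 else "0"
        return f"{int(parts[0])}.{int(minor)}"
    # Decimal form: the first three digits after the point are the minor
    # version, so 5.010001 and 5.010 both mean 5.10. Note that under this
    # rule a bare '5.6' would read as 5.600, so it is not treated as 5.6.
    major, _, frac = v.partition(".")
    frac = frac.ljust(3, "0")[:3]
    return f"{int(major)}.{int(frac)}"

# Made-up sample of declared minimum versions, for illustration only.
sample = ["5.10.0", "5.10.1", "v5.14.2", "5.008001", "5.006", "5.010"]
counts = Counter(major_minor(v) for v in sample)
print(counts.most_common())
```

Counting on the collapsed string keeps the chart to one bar per minor release instead of one per patch level.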

The three big hitters are 5.6, 5.8, and 5.10, with 5.14 in fourth place.

The River of CPAN

The following chart shows the number of CPAN distributions at each stage in the River of CPAN.

There are nearly 39,000 distributions on CPAN, and of those just under 27,000 (69%) are not used by any other distribution on CPAN. 9,008 distributions (23%) are relied on by between 1 and 9 other distributions on CPAN, and so on. At the far end of the river, there are 74 distributions (0.2%) that are used by more than ten thousand other distributions. Most of those 74 are core modules (i.e. they are shipped with Perl itself).
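As a rough sketch of how a distribution could be assigned to a river stage from its number of dependents: the text above only gives the first stages (0, and 1 to 9) and the last (more than ten thousand), so the power-of-ten boundaries in between are an assumption, as is putting exactly 10,000 in the top stage. The `river_stage` name and the sample data are hypothetical.

```python
from collections import Counter

def river_stage(dependents: int) -> int:
    """Map a count of dependent distributions to a river stage.

    Assumed stages: 0 -> no dependents, 1 -> 1-9, 2 -> 10-99,
    3 -> 100-999, 4 -> 1,000-9,999, 5 -> 10,000 or more.
    """
    if dependents < 0:
        raise ValueError("dependents must be non-negative")
    if dependents == 0:
        return 0
    # The number of decimal digits gives the power-of-ten bucket,
    # capped at the top stage.
    return min(len(str(dependents)), 5)

# Made-up dependent counts, for illustration only.
hypothetical = {"Dist-A": 0, "Dist-B": 3, "Dist-C": 42, "Dist-D": 12500}
stage_counts = Counter(river_stage(n) for n in hypothetical.values())

total = sum(stage_counts.values())
for stage in sorted(stage_counts):
    share = 100 * stage_counts[stage] / total
    print(f"stage {stage}: {stage_counts[stage]} distributions ({share:.0f}%)")
```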

The Age of the River

People have been releasing to CPAN for more than 25 years. Some distributions that were released in the early days are still getting occasional releases, but many aren't.

This chart tries to show how the average age of distributions changes as you move up the river.

Fewer than 10% of the distributions in the first stage of the river have had a release in the last year (the blue part of the leftmost column), and 38% haven't had a release in the last 10 years (the yellow part). As we move up the river, the recent-release percentage increases, and at the top, 86% of the top 74 distributions have had a release in the last year. The sudden jump in the blue part of the right-hand column is down to all the core modules in that pot (because Perl is released every year).

This is roughly what you'd hope to see: the further up river a distribution is, the more frequently it is likely to be released. Or perhaps people only use modules that have been released recently?

It's worth noting that some modules won't be getting regular releases because they're finished. This is often true for simpler modules that provide one function.

Conclusion

The number of new CPAN authors has been steadily declining for the last 6 years. The number of releases has also been declining, but not as steeply. That decline may be easing off, but it's too soon to say whether that was the effect of COVID.

What other charts would be interesting to see?
