One third of CPAN distributions (33.1%) have a GitHub repository, but which distributions are they, and are distributions more likely to have a repo if they're further up the CPAN River? This is a quick post to record the stats for future comparison.
At the QA Hackathon in Berlin in April 2015, we discussed how development practices should mature as a distribution moves up river. One of the points was that by the time a dist has reached the middle of the river, it's a good idea to have a public source code repository. This makes it easier for other people to contribute, and leaves a clear master source should you disappear for some reason.
The following table summarises how many distributions on each stage of the River have a repo listed in the distribution's metadata.
| Number of downstream dependents | 10k+ | 1k - 9999 | 100 - 999 | 10 - 99 | 1 - 9 | 0 |
|---|---|---|---|---|---|---|
| # dists | 45 | 195 | 570 | 1589 | 8210 | 21250 |
| No repo | 17 | 62 | 190 | 737 | 4575 | 15748 |
| % no repo | 37.8% | 31.8% | 33.2% | 46.3% | 55.6% | 74.0% |
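As with the water quality metrics, the picture improves as you go up river: the share of dists with no repo falls from 74.0% at the bottom to 31.8% at 1k - 9999 dependents. Then you get to the head of the river (10k+ dependents), at which point it gets worse again (37.8%).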
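As a sanity check, the headline figure of 33.1% can be recomputed from the table. This is only a minimal Python sketch: the variable names are mine, and the counts are copied by hand from the table above.

```python
# Counts copied from the table, ordered from the head of the river
# (10k+ dependents) down to dists with 0 dependents.
dists = [45, 195, 570, 1589, 8210, 21250]
no_repo = [17, 62, 190, 737, 4575, 15748]

total = sum(dists)
with_repo = total - sum(no_repo)

# Share of all dists that list a repo, as a percentage to one decimal place.
share_with_repo = round(100 * with_repo / total, 1)

print(f"{with_repo} of {total} dists list a repo ({share_with_repo}%)")
# prints: 10530 of 31859 dists list a repo (33.1%)
```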
The reality is that there are a lot of CPAN distributions that have a GitHub repo, but the repo isn't listed in the distribution's metadata. The first thing I plan to do is work down from the top of the river, looking for these cases, and submitting pull requests. Feel free to help!
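If you want to help, the check itself is simple. The sketch below assumes you already have a dist's META.json or META.yml parsed into a Python dict, and looks for the `resources.repository` entry described in the CPAN::Meta spec. The function name `has_repository` and the sample metadata are made up for illustration, and this is not the code I used to generate the stats above.

```python
import json


def has_repository(meta: dict) -> bool:
    """Return True if the distribution metadata lists a source repository."""
    resources = meta.get("resources") or {}
    repo = resources.get("repository")
    if isinstance(repo, dict):
        # META.json (spec version 2): a map with 'url', 'web' and/or 'type' keys.
        return bool(repo.get("url") or repo.get("web"))
    # META.yml (spec version 1.4): a plain URL string, or missing entirely.
    return bool(repo)


# Hypothetical metadata, not taken from a real distribution.
sample = json.loads("""
{
  "name": "Example-Dist",
  "resources": {
    "repository": {
      "type": "git",
      "url": "https://github.com/example/example-dist.git",
      "web": "https://github.com/example/example-dist"
    }
  }
}
""")

print(has_repository(sample))            # True
print(has_repository({"name": "Bare"}))  # False
```

A dist that fails this check may still have a public repo somewhere; those are exactly the cases worth a pull request.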