Introducing CPAN::Releases::Latest

Tags: CPAN, iterators, MetaCPAN · Tue 29 April 2014

Last year I released PAUSE::Packages, which provides a simple interface for iterating over the latest non-developer release of every dist currently on CPAN. It's based on the PAUSE file 02packages.details.txt, which, by definition, doesn't contain developer releases. But sometimes we (or I, at least) want to know about all recent releases, including developer releases. This prompted me to create CPAN::Releases::Latest.

A little while back I wrote about a CPAN author's dashboard. In between various other projects I'm slowly working towards creating a site that will provide a dashboard for all CPAN authors. I want my dashboard to show the latest release for all my dists, and if the latest is a developer release, then I also want to see the latest non-developer release.

So to build a site with everyone's dashboard, I need an index of all the latest releases on CPAN, including developer releases.

I tried to find a place where I could get such a list, or at least the parts. But there isn't one (that I could find). So I suggested that maybe PAUSE could provide a new index. Andreas pointed out that I could use the MetaCPAN API to solve my problem. So I did.

Here's the code to iterate over all releases:

use CPAN::Releases::Latest;

my $latest   = CPAN::Releases::Latest->new(max_age => '1 day');
my $iterator = $latest->release_iterator();

while (my $release = $iterator->next_release) {
    print $release->path, "\n";
}

If there's a non-developer release for a dist, you'll see that first, and if there's a more recent developer release, you'll see the latest developer release after that. So for each dist you'll see at most two releases, and the developer release always comes after the non-developer release. For the Graph distribution, you'll currently see two paths from the above code:

J/JH/JHI/Graph-0.96.tar.gz
N/NE/NEILB/Graph-0.96_01.tar.gz
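Because the developer release always follows the non-developer release for the same dist, a dashboard can collect both in a single pass. The following is only a sketch of that idea, not code from the module: it uses plain hashes as stand-ins for release objects, the dist and maturity keys and the latest_by_dist helper are hypothetical names, and the 'released' value is an assumption (only 'developer' appears in this post). The two paths are the Graph examples above.

```perl
use strict;
use warnings;

# Stand-in records, in the order the iterator is described as returning them:
# the non-developer release first, then the more recent developer release.
my @releases = (
    { dist => 'Graph', path => 'J/JH/JHI/Graph-0.96.tar.gz',      maturity => 'released'  },
    { dist => 'Graph', path => 'N/NE/NEILB/Graph-0.96_01.tar.gz', maturity => 'developer' },
);

# Hypothetical helper: build a per-dist summary holding at most one
# non-developer path and at most one developer path.
sub latest_by_dist {
    my @records = @_;
    my %latest;
    for my $record (@records) {
        my $slot = $record->{maturity} eq 'developer' ? 'developer' : 'stable';
        $latest{ $record->{dist} }{$slot} = $record->{path};
    }
    return \%latest;
}

my $latest = latest_by_dist(@releases);
for my $dist (sort keys %$latest) {
    my $entry = $latest->{$dist};
    print "$dist\n";
    print "  latest developer release:     $entry->{developer}\n" if $entry->{developer};
    print "  latest non-developer release: $entry->{stable}\n"    if $entry->{stable};
}
```

With real release objects the loop body would read the same values through accessor methods instead of hash keys.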

The release object has several attributes; the examples in this post use path (above) and distinfo (below).

So if you wanted to process only those developer releases that are the latest release of their dist (e.g. in a smoker), you could write:

while (my $release = $iterator->next_release) {
    next unless $release->distinfo->maturity eq 'developer';
    ...
}

If you're interested in using this module but need more of the information that MetaCPAN provides, let me know: I can easily add it to the index.

Under the hood

Under the hood it uses the shiny new MetaCPAN::Client to run a single query against MetaCPAN. Given the nature of Elasticsearch, the query can't return only the information I need, so the module processes the results and generates a local cache of the information needed for the iterator.

If you look at the first code sample above, you'll see that you can pass a max_age to the constructor. This is converted to a number of seconds using Time::Duration::Parse. If you already have a cached index, and it was generated within the last max_age seconds, then that will be used. Otherwise the index is regenerated, and the fresh copy is used.
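The freshness check described above can be sketched roughly as follows. This is an illustration under assumptions, not the module's actual code: the cache_is_fresh helper and the cache file name are hypothetical, and the cache's age is approximated here by the file's modification time. The parse_duration function is the one exported by Time::Duration::Parse.

```perl
use strict;
use warnings;
use Time::Duration::Parse qw(parse_duration);

# Hypothetical helper: true if the cache file exists and was last
# modified less than max_age ago.
sub cache_is_fresh {
    my ($cache_path, $max_age) = @_;

    return 0 unless -f $cache_path;

    # Convert a human-readable duration such as '1 day' to seconds.
    my $max_age_seconds = parse_duration($max_age);

    # Element 9 of stat() is the last-modified time in epoch seconds.
    my $modified = (stat $cache_path)[9];

    return (time() - $modified) < $max_age_seconds ? 1 : 0;
}

# Hypothetical cache location, for illustration only.
my $cache_path = 'latest-releases.txt';

if (cache_is_fresh($cache_path, '1 day')) {
    print "using cached index\n";
}
else {
    print "regenerating index\n";
}
```

Passing a longer max_age means fewer queries against MetaCPAN, at the cost of a staler index.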

Part of my motivation to get this done came from talking to Paul Johnson at the QA Hackathon, and realising that we had similar needs for iterating over CPAN.

Thanks to ANDK for pointing me in this direction, and MICKEY for suggestions on improving the way I use MetaCPAN::Client, and tweaking the client module to support my needs.
