CPAN modules for getting module dependency information


Neil Bowers

2012-08-29

This is a comparison of modules that can be used to get dependency information for Perl modules. I was working on a review of modules for making HTTP requests, and realised that some of them were pulling in a lot of other modules. I wanted to be able to visualize this, so had a look to see what was on CPAN.

If you don't want to read this (long) review, skip to the Conclusion.

The following is a list of the modules I'm aware of so far. Please let me know if I've missed any: neilb at cpan dot org.

Module                       Doc   Version   Author                      # bugs  # users  Last update
App::FatPacker::Trace        CPAN  0.009009  Karen Etheridge             1       4        2012-08-03
CPAN::Dependency             CPAN  0.15      Sébastien Aperghis-Tramoni  1       0        2008-03-05
CPAN::FindDependencies       CPAN  2.4       David Cantrell              1       0        2012-05-27
Devel::Dependencies          CPAN  1.01      Neil Bowers                 0       0        2012-08-10
Devel::Loaded                CPAN  1.10      Mark Leighton Fisher        5       1        2008-02-29
Devel::Modlist               CPAN  0.801     Randy J Ray                 0       0        2008-09-05
Devel::TraceDeps             CPAN  v0.0.3    Eric Wilhelm                0       0        2009-01-31
Devel::TraceLoad             CPAN  1.04      Andy Armstrong              0       1        2009-06-15
Devel::TraceUse              CPAN  2.06      Philippe Bruhat (BooK)      2       1        2012-01-14
Devel::VersionDump           CPAN  0.02      Rob Hoelz                   0       0        2011-01-14
Dist::Requires               CPAN  0.008     Jeffrey Ryan Thalhammer     0       1        2012-08-20
HTML::Perlinfo::Loaded       CPAN  1.02      Michael Accardo             2       1        2011-06-13
Module::Dependency::Grapher  CPAN  6632      Tim Bunce                   2       0        2006-07-12
Module::Depends              CPAN  0.16      Richard Clamp               8       6        2012-05-03
Module::Depends::Tree        CPAN  1.00      Andy Lester                 0       0        2006-11-24
Module::Extract::Use         CPAN  1.03      brian d foy                 1       2        2012-08-02
Module::ExtractUse           CPAN  0.28      Thomas Klausner             8       5        2012-08-21
Module::Info                 CPAN  0.32      Mattia Barbon               5       11       2010-09-08
Module::Inspector            CPAN  1.05      Adam Kennedy                2       1        2008-08-16
Module::MakefilePL::Parse    CPAN  0.12      Robert Rothenberg           1       1        2004-09-03
Module::Overview             CPAN  0.01      Jozef Kutej                 0       0        2010-09-26
Module::ParseDeps            CPAN  0.02      Robert Rothenberg           0       0        2004-07-19
Module::PrintUsed            CPAN  0.05      Christian Renz              2       0        2009-09-08
Module::ScanDeps             CPAN  1.08      Roderich Schupp             11      10       2012-02-21
Module::Used                 CPAN  v1.3.0    Elliot Shank                0       1        2012-08-28
Perl::PrereqScanner          CPAN  1.014     Ricardo SIGNES              4       8        2012-07-26

I've also included a module that I've written, Devel::DependencyGrapher, since none of the other modules did quite what I was looking for. It's not on CPAN yet.

There are three basic approaches taken by the modules described below:

  1. Parsing one of the metadata files in the target module's distribution. This obviously relies on the module's author correctly listing all dependencies, and might not distinguish between build/test and run-time dependencies.
  2. Parsing the source of the target module, looking for use and require statements. This will find most potential dependencies, but in doing so might report modules that will never be used on your platform. It will also miss modules loaded via DBI->connect, some plugin mechanism, eval, or require $module, for example File::Spec's require "File/Spec/$module.pm";.
  3. Looking to see what modules are loaded during runtime, using one of several approaches. The advantage of this approach is that you only see the modules that were actually used, and the File::Spec trickery won't fool it. The disadvantage is that you might miss dependencies, for example if the modules used vary based on the inputs.

Each approach has its use, but be aware of the limitations when you're using any of these.
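
The limits of the second approach can be illustrated with a deliberately naive scanner. This is only a sketch, and not how any of the modules reviewed here work: the sample source, the naive_scan() function, and its regular expression are all made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical source to scan; nothing in it is ever executed.
my $source = <<'END_SOURCE';
use strict;
use HTTP::Lite;
if ($^O eq 'MSWin32') { require Win32::Console; }
else                  { require Term::ReadKey;  }
my $module = 'Unix';
require "File/Spec/$module.pm";
END_SOURCE

# Report every bareword module name that follows 'use' or 'require'.
sub naive_scan {
    my ($text) = @_;
    my %found;
    while ($text =~ /\b(?:use|require)\s+([A-Za-z_]\w*(?:::\w+)*)/g) {
        $found{$1} = 1;
    }
    return sort keys %found;
}

print "$_\n" for naive_scan($source);
```

The scanner reports both Win32::Console and Term::ReadKey, although only one of them would be loaded on any given platform, and it misses the File::Spec-style require of a computed file name entirely.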

Each module is presented in turn, with a comparison and conclusion at the end.

App::FatPacker::Trace

App::FatPacker::Trace has a CHECK block that looks at %INC to see what modules were loaded during the runtime of your script. In its import() function it takes a copy of %INC, then it compares that with %INC in the CHECK block. By default the list of modules used is written to the file fatpacker.trace in the current directory. Here's how I used it with my HTTP::Client test script:

% perl -MApp::FatPacker::Trace http-client.pl

Which produces the following output:

warnings/register.pm
Carp.pm
vars.pm
Socket.pm
Errno.pm
HTTP/Client.pm
Config.pm
Fcntl.pm
HTTP/Lite.pm

Note that modules are given as partial paths rather than module names, as that's what is used as the key in %INC.
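
The snapshot-and-compare idea, and the mapping from %INC keys back to module names, can be sketched in a few lines. This is an illustration of the general technique rather than App::FatPacker::Trace's actual code; the function names inc_key_to_module() and newly_loaded() are my own, and Text::ParseWords simply stands in for whatever your script loads.

```perl
use strict;
use warnings;

# Convert a %INC key such as 'HTTP/Client.pm' into 'HTTP::Client'.
sub inc_key_to_module {
    my ($key) = @_;
    (my $module = $key) =~ s{\.pm\z}{};
    $module =~ s{/}{::}g;
    return $module;
}

# Return the %INC keys present now that were not in the snapshot.
sub newly_loaded {
    my ($snapshot) = @_;
    return sort grep { !exists $snapshot->{$_} } keys %INC;
}

my %before = %INC;            # take a copy, as an import() function might
require Text::ParseWords;     # stands in for the rest of the script

my @new = map { inc_key_to_module($_) } newly_loaded(\%before);
print "$_\n" for @new;
```

Anything loaded before the snapshot was taken (here strict and warnings) is left out of the list, which is why the point at which the copy is made matters.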

You can pass an option to change where the output is written; the following shows how to get the list written to stdout:

% perl '-MApp::FatPacker::Trace=>&STDOUT' http-client.pl

CPAN::Dependency

CPAN::Dependency uses CPANPLUS to get information about modules and build dependency information. It can also get the information from a CPANTS database, though I haven't tried that. You can save the dependency information as YAML:

use CPAN::Dependency;
$dep = CPAN::Dependency->new(verbose => 1);
$dep->process('HTTP::Client');
$dep->run();
$dep->save_deps_tree(file => 'cpan-dependency-output.yaml');

When you run this, you get the following to stdout:

HTTP::Client => HTTP-Client 1.52 by NEILB (Neil Bowers)
  prereqs: Carp, HTTP::Lite
  >> Carp is in Perl core

And the output generated is:

---
HTTP-Client:
  author: Neil Bowers
  cpanid: NEILB
  prereqs:
    HTTP-Lite: 1
  score: 0
  used_by: {}

CPAN::FindDependencies

CPAN::FindDependencies works by fetching the META.yml or Makefile.PL for distributions using search.cpan.org. The following shows how to get dependencies for my test module:

use CPAN::FindDependencies;
@deps = CPAN::FindDependencies::finddeps('HTTP::Client');
open($fh, '>', $outfile) or die "Can't write $outfile: $!";
foreach my $dep (@deps) {
    print $fh ' ' x $dep->depth;
    print $fh $dep->name, " [", $dep->distribution(), "]\n";
}

This takes a little while to run, then you get the following output:

HTTP::Client [N/NE/NEILB/HTTP-Client-1.52.tar.gz]
 HTTP::Lite [A/AD/ADAMK/HTTP-Lite-2.3.tar.gz]
  ExtUtils::MakeMaker [M/MS/MSCHWERN/ExtUtils-MakeMaker-6.62.tar.gz]
   File::Spec [S/SM/SMUELLER/PathTools-3.33.tar.gz]
    Scalar::Util [P/PE/PEVANS/Scalar-List-Utils-1.25.tar.gz]
     Test::More [M/MS/MSCHWERN/Test-Simple-0.98.tar.gz]
      Test::Harness [O/OV/OVID/Test-Harness-3.25.tar.gz]
    Carp [Z/ZE/ZEFRAM/Carp-1.26.tar.gz]
     warnings [F/FL/FLORA/perl-5.15.4.tar.gz]
     Exporter [T/TO/TODDR/Exporter-5.66.tar.gz]
   Pod::Man [R/RR/RRA/podlators-2.4.2.tar.gz]
    Pod::Simple [D/DW/DWHEELER/Pod-Simple-3.22.tar.gz]
     Pod::Escapes [S/SB/SBURKE/Pod-Escapes-1.04.tar.gz]
     Text::Wrap [M/MU/MUIR/modules/Text-Tabs+Wrap-2009.0305.tar.gz]
     Test [S/SB/SBURKE/Test-1.25.tar.gz]
    Encode [D/DA/DANKOGAI/Encode-2.44.tar.gz]

Devel::Dependencies

Devel::Dependencies works at compile time. It uses the %INC hash to identify what modules have been loaded.

The following is a minimal script that uses HTTP::Client to GET a web page:

use HTTP::Client;
$client   = HTTP::Client->new();
$response = $client->get('http://perl.org/');

To identify dependencies, you run the following:

% perl -MDevel::Dependencies http-client.pl

When you run this, you get the following output:

Devel::Dependencies finds 11 dependencies:
  Carp.pm
  Config.pm
  Errno.pm
  Exporter.pm
  Fcntl.pm
  HTTP/Client.pm
  HTTP/Lite.pm
  Socket.pm
  XSLoader.pm
  vars.pm
  warnings/register.pm

This shows you all the modules that were loaded, but gives no sense of which modules depend on which.

This module was written by Jean-Louis Leroy, as a by-product of writing his article A Timely Start, where he describes investigating why a Perl script ran 10 times slower than the shell script it replaced. Interestingly, it was a similar line of questioning that led me to do this review, and now I've taken over maintenance of this module.

Devel::DependencyGrapher

Having looked at all the other modules presented here, none of them did exactly what I was after. So I've created a module which just about does what I wanted. It's not on CPAN yet, partly because I'm not sure what to call it.

My first attempt used the trick of putting a function at the head of @INC, but the trouble is that you only get to hear about the first time a module is used/required.

So the current version works by overriding require and logging information to a text file. Using the script presented in the previous section, here's a simple way to use Devel::DependencyGrapher:

% perl -MDevel::DependencyGrapher http-client.pl

This generates a data file, which is currently a simple text file. I've got a script that takes one or more module names, reads the data file, and spits out a graph in the dot format used by graphviz. This lets you generate a data file for a large application that uses a lot of modules, and then pick a specific module to see the dependency graph for.

% ddg2dot HTTP::Client > http-client.dot

You can then generate this in various formats. Here's how I generated a PNG file:

% dot -Tpng -Grankdir=TB -Nfontsize=10 -Nshape=rect -ohttp-client.png http-client.dot

And here's the generated graph: [image: dependency graph for HTTP::Client]

There are a number of things I might add in the near future: the ability to leave out core modules; mapping modules to distributions. Now I'm maintainer of Devel::Dependencies, I will probably fold this module into that distribution, maybe as Devel::Dependencies::Grapher, though that's a bit of a mouthful!
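
Overriding require, as described above, can be sketched roughly as follows. This is not the code of Devel::DependencyGrapher: the @require_log variable and the log format are assumptions made for the example, and it logs to memory rather than to a text file.

```perl
use strict;
use warnings;

our @require_log;   # one [caller package, argument] pair per require call

BEGIN {
    # Install a global override; it only affects code compiled after this point.
    *CORE::GLOBAL::require = sub {
        my ($arg) = @_;
        my ($caller_package) = caller;
        push @require_log, [ $caller_package, $arg ];
        return CORE::require($arg);   # hand over to the built-in require
    };
}

require Text::ParseWords;   # loads the module
require Text::ParseWords;   # already in %INC, but the override is still called

for my $entry (@require_log) {
    printf "%-20s requires %s\n", @$entry;
}
```

Because the override is called on every require, not just the first one for each module, the log records each dependency edge, which is what a full graph needs. Note that a bareword such as Text::ParseWords reaches the override in its file form, Text/ParseWords.pm.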

Devel::Loaded

Devel::Loaded is part of the pmtools distribution, which was originally written by Tom Christiansen. It has an END block that dumps out the paths from %INC. Here's the use with my standard HTTP::Client test script:

% perl -MDevel::Loaded http-client.pl

When you run this, you get the following to stdout:

/usr/local/lib/perl5/5.16.0/XSLoader.pm
/usr/local/lib/perl5/5.16.0/warnings/register.pm
/usr/local/lib/perl5/5.16.0/Carp.pm
/usr/local/lib/perl5/5.16.0/vars.pm
/usr/local/lib/perl5/5.16.0/Exporter.pm
/usr/local/lib/perl5/5.16.0/strict.pm
/usr/local/lib/perl5/5.16.0/darwin-2level/Socket.pm
/usr/local/lib/perl5/site_perl/5.16.0/Devel/Loaded.pm
/usr/local/lib/perl5/5.16.0/darwin-2level/Errno.pm
/usr/local/lib/perl5/5.16.0/warnings.pm
/usr/local/lib/perl5/site_perl/5.16.0/HTTP/Client.pm
/usr/local/lib/perl5/5.16.0/darwin-2level/Config.pm
/usr/local/lib/perl5/5.16.0/darwin-2level/Fcntl.pm
/usr/local/lib/perl5/site_perl/5.16.0/HTTP/Lite.pm

The code for this module is so brief, we can show it in full here:

# Loaded.pm -- show what files were loaded 
# tchrist@perl.com

package Devel::Loaded;
 
$VERSION = '1.10';
 
BEGIN { %Seen = %INC } 
 
END { 
    delete $INC{"Loaded.pm"};
 
    for my $path (values %INC) {
        print "$path\n" unless $Seen{$path};
    } 
}

The module was obviously called Loaded in the past; it's trying to delete itself from the output, but uses the old name. I've submitted a bug on this, with suggested patch.

This is the only "%INC dumping" module that prints the path rather than the module name. I think most of the time I'd want the name, but I can imagine there might be times when this will be handy.

Devel::Modlist

Devel::Modlist has an END block that looks at %INC to see what modules were loaded during the runtime of your script, and prints out a summary, including the version of each module. Here's how I used it with my HTTP::Client test script:

% perl -d:Modlist http-client.pl

Which produces the following output:

Carp                   1.26
Config                     
Errno                  1.15
Exporter               5.66
Fcntl                  1.11
HTTP::Client           1.52
HTTP::Lite              2.4
Socket                2.001
XSLoader               0.16
vars                   1.02
warnings               1.13
warnings::register     1.02

It supports a number of options, to change what information is shown, and how. For example, to exclude core modules, use the nocore option:

% perl -d:Modlist=nocore http-client.pl
HTTP::Client           1.52
HTTP::Lite              2.4

The cpandist option will produce a list of the CPAN distributions that the script is dependent on:

% perl -d:Modlist=nocore,cpandist http-client.pl
N/NE/NEILB/HTTP-Client-1.52.tar.gz
N/NE/NEILB/HTTP-Lite-2.4.tar.gz

The documentation covers the other options, such as noversion, which suppresses display of version numbers in the output.

Any use of the strict pragma won't be reported, as Devel::Modlist removes it from the output, as it uses strict itself. I think it would be better for Devel::Modlist to not use strict itself — all the tests could use strict, so the author wouldn't miss anything.

Devel::TraceDeps

To be reviewed ...

Devel::TraceLoad

To be reviewed ...

Devel::TraceUse

Devel::TraceUse displays a tree view of the modules used by your code. For example, to inspect my test script for HTTP::Client:

% perl -d:TraceUse http-client.pl

When you run this, you get the following to stdout:

Modules used from http-client.pl:
   1.  HTTP::Client 1.52, http-client.pl line 3 [main]
   2.    strict 1.07, HTTP/Client.pm line 4
   3.    warnings 1.13, HTTP/Client.pm line 5
   4.    Carp 1.26, HTTP/Client.pm line 6
   5.      Exporter 5.66, Carp.pm line 35
   6.    HTTP::Lite 2.4, HTTP/Client.pm line 7
   7.      Socket 2.001, HTTP/Lite.pm line 5
   8.        warnings::register 1.02, Socket.pm line 649
   9.        XSLoader 0.16, Socket.pm line 652
  10.      Fcntl 1.11, HTTP/Lite.pm line 6
  11.      Errno 1.15, HTTP/Lite.pm line 7
  12.        Config, Errno.pm line 8
  13.          vars 1.02, Config.pm line 11

You can also run it with perl -MDevel::TraceUse, but with -d:TraceUse you get more information. Devel::TraceUse will also show where a module failed to load, and if it doesn't know which module loaded a particular module (for example if it was loaded in an eval and you were using -MDevel::TraceUse), it will be listed at the end.

I had the following from the author, Philippe Bruhat (BooK), in email:

It started as a hack by chromatic in the "Perl Hacks" book. After using it for a while, I found it had some issues, so I fixed the bugs, and ended up rewriting it almost entirely. I'm now its maintainer.

Philippe gave a remote lightning talk at YAPC::Europe 2010 — a video showing the differences between Devel::TraceUse 1.00 (chromatic's) and Devel::TraceUse 2.00 (Philippe's).

Devel::TraceUse uses the technique I used with the first version of my module: it installs a coderef at the start of @INC. The referenced function then gets called every time require tries to load a module; it works out who the caller is and what module is being loaded, and stores this info. It then returns undef, so that require will continue working down @INC. The disadvantage of this approach is that the module will only see the first time a module is used. You still get to see all modules that are loaded by your code, but you don't get to see the full dependency graph.
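
The @INC hook technique can be sketched like this. It is a simplified illustration rather than Devel::TraceUse's real implementation; the @first_loads variable and the hash keys are names I've made up for the example.

```perl
use strict;
use warnings;

my @first_loads;

# A code reference in @INC is called for each file require has to search for.
unshift @INC, sub {
    my ($self, $file) = @_;
    my ($package, $caller_file, $line) = caller;
    push @first_loads, { file => $file, by => $package, line => $line };
    return;   # nothing found here, so require carries on down @INC
};

require Text::ParseWords;
require Text::ParseWords;   # found in %INC, so @INC (and the hook) is not consulted

printf "%s loaded by %s\n", $_->{file}, $_->{by} for @first_loads;
```

The second require never reaches the hook, which is the limitation described above: you learn who loaded a module first, but not who else asked for it later.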

Devel::VersionDump

To be reviewed ...

Dist::Requires

Dist::Requires extracts prerequisite information from a metadata file for a distribution. You can either tell it the path to a gzip'd tarball, or the path to the directory where a distribution has been unpacked. Dist::Requires will try and configure the distribution using its build mechanism.

The following shows how to extract pre-requisites from an unpacked HTTP::Client distribution:

use Dist::Requires;
$dr = Dist::Requires->new(filter => {});
%prereqs = $dr->prerequisites(dist => $path_to_http_client);
while (($module, $version) = each %prereqs) {
    print "  $module => $version\n";
}

By default Dist::Requires will not list core modules. You can override this behaviour with the filter argument, which takes a hashref; in this you list any modules you want to filter out of the pre-requisites. Passing an empty hashref will show all prerequisites. You can specify the version of perl you're interested in, and Dist::Requires will only exclude core modules from that version (identified using Module::CoreList).

When you run the above, you get the following:

  ExtUtils::MakeMaker => 0
  HTTP::Lite => 0
  Carp => 0

This doesn't provide any mechanism for recursively identifying prerequisites; you'd have to roll one of those for yourself, by repeatedly calling the prerequisites method and either finding local copies of modules or downloading their distributions. If you want to find recursive dependencies in this way, you'd be better off using CPAN::FindDependencies, as it already does that.

The documentation suggests that Dist::Requires isn't very robust (compared to CPAN and cpanm), but in the few simple examples I tried it seemed to work fine. It is some of the cleanest looking code I've seen while reviewing modules.
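
If you already have a list of prerequisites and just want to drop the core modules, Module::CoreList can be used directly. The following is a rough sketch and not what Dist::Requires does internally: it treats a module as core if it has shipped with any release of perl, rather than checking a specific perl version, and the %prereqs hash is simply the output shown above typed in by hand.

```perl
use strict;
use warnings;
use Module::CoreList;

# Prerequisites as reported for HTTP::Client above.
my %prereqs = (
    'ExtUtils::MakeMaker' => 0,
    'HTTP::Lite'          => 0,
    'Carp'                => 0,
);

# first_release() returns undef for modules that have never shipped with perl.
my @non_core = grep { !defined Module::CoreList->first_release($_) }
               sort keys %prereqs;

print "  $_ => $prereqs{$_}\n" for @non_core;
```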

HTML::Perlinfo::Loaded

To be reviewed ...

Module::Dependency::Grapher

Module::Dependency includes 3 main modules: the indexer, an info module, and a grapher. The indexer parses local Perl files (modules and scripts) and extracts information from them, and then stores it in a Storable. The grapher takes one or more modules and pulls dependency information from the Storable, to produce a number of formats.

use Module::Dependency::Grapher;
use Module::Dependency::Indexer;
Module::Dependency::Indexer::setIndex( 'module-dependency-info.dat' );
Module::Dependency::Indexer::makeIndex( $path_to_core, $path_to_site_perl );

Module::Dependency::Grapher::setIndex( 'module-dependency-info.dat' );
Module::Dependency::Grapher::makeText( 'both', ['HTTP::Client'], 'md-grapher.txt', {NoLegend => 1} );
Module::Dependency::Grapher::makeImage( 'both', ['HTTP::Client'], 'md-grapher.png', {Format => 'png'} );

Here's the text version produced for HTTP::Client:

Dependency Tree
---------------

Grapher.pm 6632 - Thu Jul 12 09:31:43 2012

   ****> +- HTTP::Client
         |
  Child> +- 5, Carp, HTTP::Lite, strict, warnings
         |
  Child> +- Errno, Exporter, Fcntl, Socket, vars
         |
  Child> +- Config, Exporter::Heavy, Scalar::Util, XSLoader, warnings::register
         |
  Child> +- DynaLoader, List::Util, Scalar::Util::PP
         |
  Child> +- B, List::Util::PP, overload
         |
  Child> +- mro

And here's the image version: [image: dependency graph for HTTP::Client]

The problem with parsing (rather than running) the source is that you might end up inferring dependencies inaccurately. For example, one module might have logic that decides which of two modules to require, but the parsing approach might assume that both are dependencies.

Module::Depends

Module::Depends uses Parse::CPAN::Meta to extract dependency information from the CPAN metadata files in an unpacked distribution.

use Module::Depends;
use YAML;

$md = Module::Depends->new();
$md->dist_dir($path_to_dist);
$deps = $md->find_modules();
open($fh, '>', $outfile) or die "Can't write $outfile: $!";
print $fh Dump $deps->requires;

I am now co-maint on HTTP::Client, and did a release where I added Carp as a pre-requisite. Here's the output after running on the previous version:

---
HTTP::Lite: 0

But when I run it on my latest release, I get the following:

--- {}

I'll have to look into what's going on there, as META.yml does mention the dependencies:

requires:
  Carp: 0
  HTTP::Lite: 0

This module has a number of 'problems':

Module::Depends::Tree

Module::Depends::Tree provides the heavy lifting for the deptree script, which is included in the distribution. Neither the module nor the script has any documentation to speak of, but the script only has two command-line options: mirror identifies a CPAN mirror for the module to use; workdir specifies a local directory to use as a cache for distribution tarballs.

The following shows how to generate output for HTTP::Client:

% deptree --mirror=http://mirror.bytemark.co.uk/CPAN/ HTTP::Client

When you run this, you get the following output:

Dependency tree created Thu Jul 12 12:59:51 2012
Created with Module::Depends::Tree 1.00
$ /usr/local/bin/deptree HTTP::Client

HTTP::Client
    HTTP::Lite


Number of times each module is used
    HTTP::Client
    HTTP::Lite

2 total modules

HTTP-Client-1.52.tar.gz
    HTTP::Client
HTTP-Lite-2.3.tar.gz
    HTTP::Lite

2 total packages

Which is more human-friendly than computer-friendly. This is a handy script for identifying all the dists that a given module depends on, as it ignores core modules.

It turns out that the human friendliness was intentional. Andy commented to me via email:

the intent was, indeed, to be human readable. If I recall correctly, I had a project on two different machines and I wanted to see the 150-or-so modules that the project used, and what the differences were between the two machines. I would run the tree program and then diff the output to ID problems.

Module::Extract::Use

Module::Extract::Use uses PPI to parse a Perl source file and extract modules used. The simplest use is the get_modules() method, which takes a file and returns a list of modules:

use Module::Extract::Use;

$extractor = Module::Extract::Use->new;
@modules = $extractor->get_modules($path_to_http_client_pm);
print "HTTP::Client depends on:\n";
foreach my $m (@modules) {
    print "  $m\n";
}

Which produces the following output:

HTTP::Client depends on:
  strict
  warnings
  Carp
  HTTP::Lite

Module::Extract::Use also provides the get_modules_with_details() method, which returns a list of hashrefs, one per dependency. Each hash contains the following keys:

This will be clearer with an example. Given the following example source:

require 5.16.0;
use warnings 'all';
use Net::HTTP::Tiny 0.001 qw(http_get);
use constant 1.23 DEBUG => 0;

Here's a script which uses get_modules_with_details():

use Module::Extract::Use;

$extractor = Module::Extract::Use->new;
$details = $extractor->get_modules_with_details($path_to_http_client_pm);
print "HTTP::Client depends on:\n\n";
foreach my $m (@$details) {
    print "  $m->{module}:\n";
    print "     version = $m->{version}\n" if defined($m->{version});
    print "     pragma  = $m->{pragma}\n";
    print "     imports = ", join(' ', @{ $m->{imports} }),"\n";
    print "\n";
}

Which produces the following output:

HTTP::Client depends on:

  warnings:
     pragma  = warnings
     imports = all

  Net::HTTP::Tiny:
     version = 0.001
     pragma  = 
     imports = http_get

  constant:
     version = 1.23
     pragma  = constant
     imports = 

Note that require 5.16.0 doesn't get included in the output (it does with some of the modules here). And I'm not sure why the constant definition doesn't appear in the imports list.

Module::ExtractUse

Module::ExtractUse uses Parse::RecDescent to parse perl files and find use and require statements. You can either pass a string which contains the source, or the path for a file. The following shows basic use:

use Module::ExtractUse;

$extractor = Module::ExtractUse->new;
$extractor->extract_use($path_to_http_client_pm);
print "HTTP::Client depends on:\n";
foreach my $m ($extractor->array) {
    print "  $m\n";
}

When you run this, you get the following output:

HTTP::Client depends on:
  warnings
  strict
  5.8.0
  HTTP::Lite
  Carp

Notice that it has included 5.8.0 as a dependency, because it saw the line:

use 5.8.0;

I was going to report this as a bug, but noticed that the distribution has a testsuite for this (use version), so it's obviously intentional.

Module::Info

Module::Info will provide various information about a module, some of it without loading the module, and some with loading involved. Here's how you get the list of used modules:

use Module::Info;

$mi = Module::Info->new_from_module('HTTP::Client');
print "HTTP::Client uses the following modules:\n";
foreach my $m ($mi->modules_used) {
    print "  $m\n";
}

Which produces the output:

HTTP::Client uses the following modules:
  strict
  warnings
  HTTP::Lite
  Carp

Interestingly the modules_used() function appears in the documentation section that starts with the line:

WARNING! From here down reliability drops rapidly!

In addition to the above, Module::Info can also tell you: the path where the module is installed, the version, whether it's a core module, what packages are defined in it, a list of subroutines defined, superclasses, and what subroutines are called. It uses the B::Utils module and caveats "lots of cargo-culting from B::Deparse", which I enjoyed.

I haven't really done this module justice, but I can imagine that now it's tucked away in my grey matter, I might find myself using it again.

When installing this, some of the tests failed. They looked harmless so I did a force install.

Module::Inspector

Module::Inspector is a front-end to a number of modules that can be used to get information about a distribution by parsing files in the distribution. Here's how you get the list of build and run-time dependencies for a distribution:

use Module::Inspector;

$inspector = Module::Inspector->new( dist_dir => $path_to_dist );

print "Run-time dependencies:\n";
print $inspector->dist_requires->as_string, "\n";

print "Build dependencies:\n";
print $inspector->dist_build_requires->as_string, "\n";

Which produces the output:

Run-time dependencies:
Carp: 0
HTTP::Lite: 0

Build dependencies:
ExtUtils::MakeMaker: 0

The dist_requires() and dist_build_requires() methods return an instance of Module::Math::Depends. You can't get a list of modules from this, just the whole lot via as_string().

The dist_depends() method returns the union of dist_requires() and dist_build_requires().

Module::Inspector was clearly meant to provide a lot more — the TO DO section of the pod says "Implement most of the functionality". It was created by the prolific Adam Kennedy, but given he's not active in the Perl world right now, this module seems unlikely to progress.

Module::MakefilePL::Parse

Module::MakefilePL::Parse parses the contents of a Makefile.PL looking for pre-requisites specified in the format used by ExtUtils::MakeMaker and Module::Install. The following shows how to get dependencies:

use Module::MakefilePL::Parse;
use File::Slurp;

$contents = read_file($makefile_pl_path);
$parser   = Module::MakefilePL::Parse->new($contents);
$reqs     = $parser->required();

print "Dependencies are:\n";
foreach my $module (keys %$reqs) {
    print "  $module $reqs->{$module}\n";
}

Running this on the Makefile.PL for HTTP::Client, we get:

Dependencies are:
  HTTP::Lite 0
  Carp 0

The parsing is too context sensitive though, making assumptions about how you'll write your Makefile.PL. For example, the Makefile.PL for the Template Toolkit builds up a hash and then calls:

WriteMakefile( %opts );

As a result, Module::MakefilePL::Parse fails to find any dependencies.

This module is no longer supported: its CPAN support status is 'abandoned'. So don't use it.

Module::Overview

Module::Overview provides various pieces of information about a module, somewhat like Module::Info. The following shows how to get the list of modules used:

use Module::Overview;

$mo = Module::Overview->new({ module_name => 'HTTP::Client' });
$info = $mo->get();
print "HTTP::Client uses the following modules:\n";
foreach my $m (@{ $info->{ uses } }) {
    print "  $m\n";
}

Which produces the following output:

HTTP::Client uses the following modules:
  Carp
  HTTP::Lite

It uses Module::ExtractUse to get this information.

Module::Overview can also generate a summary table of all the information available:

use Module::Overview;

$mo = Module::Overview->new({ module_name => 'HTTP::Client' });
print $mo->text_simpletable();

Which produces the following output:

.------------------+--------------------------------------------------------------.
| class            | HTTP::Client                                                 |
+------------------+--------------------------------------------------------------+
| uses             | Carp                                                         |
|                  | HTTP::Lite                                                   |
+------------------+--------------------------------------------------------------+
| methods          | agent()                                                      |
|                  | content_encoding()                                           |
|                  | content_length()                                             |
|                  | content_type()                                               |
|                  | date()                                                       |
|                  | from()                                                       |
|                  | get()                                                        |
|                  | host()                                                       |
|                  | last_modified()                                              |
|                  | new()                                                        |
|                  | protocol()                                                   |
|                  | response_headers()                                           |
|                  | server()                                                     |
|                  | status_message()                                             |
|                  | title()                                                      |
|                  | warning()                                                    |
+------------------+--------------------------------------------------------------+
| methods_imported | carp()                                                       |
|                  | confess()                                                    |
|                  | croak()                                                      |
'------------------+--------------------------------------------------------------'

The get() method used in the first example returns a hash ref keyed by the names in the left-hand column; as the first example shows for uses, the value is an array ref containing what you see on the right.

Module::Overview also provides a graph() method, which returns an instance of Graph::Easy containing the same information as the table.

Module::ParseDeps

Module::ParseDeps parses files in a CPAN distribution to identify dependencies. The files it looks for are META.yml, *.meta, and Makefile.PL. If a Makefile.PL is found, it is parsed by Module::MakefilePL::Parse, which is also included in this review.

The module exports one function, parsedeps(), which takes the path to an unpacked distribution:

use Module::ParseDeps;

$reqs = parsedeps($path_to_dist);

print "Dependencies are:\n";
foreach my $module (keys %$reqs) {
    print "  $module $reqs->{$module}\n";
}

Running this on a local copy of HTTP::Client I get:

Error parsing META file:  at ./module-parsedeps.pl line 14.
Error parsing META file:  at ./module-parsedeps.pl line 14.
Dependencies are:
  HTTP::Lite 0
  Carp 0

So even though I get some error messages, it does get the dependencies right. It doesn't do so well with the Template Toolkit though: I get one error message and no dependencies.

Module::ParseDeps hasn't been updated since 2004, so I'm guessing that formats for the various files have changed, which is why it doesn't work very well.

Module::PrintUsed

Module::PrintUsed prints out a list of modules used by your code. Using my simple test script for HTTP::Client:

% perl -MModule::PrintUsed http-client.pl

Which generates the following output:

Modules used by http-client.pl:
 - Carp                      1.26     /usr/local/lib/perl5/5.16.0/Carp.pm
 - Config                             /usr/local/lib/perl5/5.16.0/darwin-2level/Config.pm
 - Errno                     1.15     /usr/local/lib/perl5/5.16.0/darwin-2level/Errno.pm
 - Exporter                  5.66     /usr/local/lib/perl5/5.16.0/Exporter.pm
 - Fcntl                     1.11     /usr/local/lib/perl5/5.16.0/darwin-2level/Fcntl.pm
 - HTTP::Client              1.52     /usr/local/lib/perl5/site_perl/5.16.0/HTTP/Client.pm
 - HTTP::Lite                2.4      /usr/local/lib/perl5/site_perl/5.16.0/HTTP/Lite.pm
 - Module::PrintUsed         0.05     /usr/local/lib/perl5/site_perl/5.16.0/Module/PrintUsed.pm
 - Socket                    2.001    /usr/local/lib/perl5/5.16.0/darwin-2level/Socket.pm
 - XSLoader                  0.16     /usr/local/lib/perl5/5.16.0/XSLoader.pm
 - strict                    1.07     /usr/local/lib/perl5/5.16.0/strict.pm
 - vars                      1.02     /usr/local/lib/perl5/5.16.0/vars.pm
 - warnings                  1.13     /usr/local/lib/perl5/5.16.0/warnings.pm
 - warnings::register        1.02     /usr/local/lib/perl5/5.16.0/warnings/register.pm

The module works by looking at the %INC hash in an END block. Unlike some of the other modules which use this technique, it also gets the version of every module used, and includes that in the output.

Note that the output includes Module::PrintUsed itself, even though it's not actually a dependency of the script. The module could just exclude itself from the list, but given it's working off %INC, it has no way of knowing whether one of the other modules used it as well.
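The %INC technique is easy to sketch. The following is not Module::PrintUsed's actual code, just a minimal illustration of the idea: loaded_modules() is a hypothetical helper name, and Carp is loaded only so there is something to report.

```perl
use strict;
use warnings;
use Carp ();    # loaded only so that there is something to report

# Hypothetical helper (not part of Module::PrintUsed): turn %INC into
# a list of { name, version, path } records, sorted by file name.
sub loaded_modules {
    my @modules;
    foreach my $file (sort keys %INC) {
        next unless $file =~ /\.pm\z/;          # skip non-module entries
        (my $name = $file) =~ s{/}{::}g;        # HTTP/Client.pm -> HTTP::Client.pm
        $name =~ s/\.pm\z//;                    # HTTP::Client.pm -> HTTP::Client
        my $version = eval { $name->VERSION };  # undef if there is no $VERSION
        push @modules, {
            name    => $name,
            version => defined $version   ? $version   : '',
            path    => defined $INC{$file} ? $INC{$file} : '',
        };
    }
    return @modules;
}

# Report at exit, once everything the script needs has been loaded.
END {
    printf " - %-25s %-8s %s\n", @{$_}{qw(name version path)}
        for loaded_modules();
}
```

Because the report runs in an END block, it sees every module loaded over the life of the script, including those pulled in at run time with require.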

The module also provides two functions. FormattedModulesList() returns a string that contains the table shown above. ModulesList() returns a list of hashrefs. Each hash contains three keys: name, version, and path. These hold the individual items used to populate the table shown above, so they could be used as follows:

use Module::PrintUsed;
use HTTP::Client;

foreach my $dep (Module::PrintUsed::ModulesList()) {
    printf "  %24s : %s\n", $dep->{name}, $dep->{version};
}

The only problem is that the default output will also still be generated. I've submitted an RT issue for this, suggesting that there be a mechanism for suppressing the default output.

Module::ScanDeps

Module::ScanDeps parses Perl source to identify modules use'd or require'd. It exports a scan_deps() function which is used to scan a number of files:

use Module::ScanDeps;

$deps = scan_deps(
                   files   => [ $path_to_http_client_pm ],
                   recurse => 0,
                   compile => 0,
                 );

print "HTTP::Client uses the following modules:\n";
foreach my $module (keys %{ $deps }) {
    $info = $deps->{$module};

    print "  $module:\n";
    print "      type    : ", $info->{type}, "\n";
    if (defined($info->{used_by})) {
        print "      used by : ", join(', ', @{ $info->{used_by} }), "\n";
    }
    if (defined($info->{uses})) {
        print "      uses    : ", join(', ', @{ $info->{uses} }), "\n";
    }
}

Which produces the following output:

HTTP::Client uses the following modules:
  warnings.pm:
      type    : module
      used by : HTTP/Client.pm
  Carp.pm:
      type    : module
      used by : HTTP/Client.pm
  HTTP/Client.pm:
      type    : module
      uses    : Carp.pm, warnings.pm, strict.pm, HTTP/Lite.pm
  strict.pm:
      type    : module
      used by : HTTP/Client.pm
  HTTP/Lite.pm:
      type    : module
      used by : HTTP/Client.pm

Note:

If you set recurse to 1, it will recurse through all dependencies found, producing the complete dependency graph. Unfortunately Module::ScanDeps currently has a bug which results in it pulling in a whole load of non-dependencies if you use strict.

Note that here only 'module' types are displayed. Other types are 'autoload', 'data', and 'shared'. When recursing you might get a bunch of these, and there's currently no way to state that you only want to see entries of type 'module'.
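You can filter on type yourself after the scan. A minimal sketch: only_modules() is a hypothetical helper, not part of Module::ScanDeps, and the sample data is hand-built to mirror the structure shown above (the 'shared' entry is made up for illustration).

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the entries whose type is 'module'.
sub only_modules {
    my ($deps) = @_;
    my %kept = map  { $_ => $deps->{$_} }
               grep { ($deps->{$_}{type} // '') eq 'module' }
               keys %$deps;
    return \%kept;
}

# Hand-built sample, shaped like the hashref returned by scan_deps().
my $deps = {
    'HTTP/Client.pm' => { type => 'module', uses    => [ 'Carp.pm', 'HTTP/Lite.pm' ] },
    'Carp.pm'        => { type => 'module', used_by => [ 'HTTP/Client.pm' ] },
    'auto/Example/Example.so' => { type => 'shared', used_by => [ 'Example.pm' ] },
};

my $modules = only_modules($deps);
print "  $_\n" for sort keys %$modules;
```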

Module::ScanDeps is a powerful module, but I wouldn't use it at the moment. Other modules are easier to use for my purposes, and it doesn't seem to be actively supported: there are too many outstanding issues, including the critical one mentioned above, which was reported in November 2011.

Module::Used

Module::Used uses PPI to parse Perl and look for statements that use or require modules. Here's how you'd use it to process a locally installed module:

use Module::Used qw(modules_used_in_modules);

@modules = modules_used_in_modules('HTTP::Client');
foreach my $m (@modules) {
    print "  $m\n";
}

Which produces the output:

  strict
  warnings
  HTTP::Lite
  Carp

Module::Used provides two other functions:

This is a nice module: a clear simple API, and seems to work fine.

Perl::PrereqScanner

Perl::PrereqScanner uses PPI to scan Perl source and extract dependency information. You can scan a PPI doc, a source file, or a source string; in each case you get back an instance of CPAN::Meta::Requirements. The following script uses it to parse the HTTP::Client example script:

use Perl::PrereqScanner;

$scanner = Perl::PrereqScanner->new;
$prereqs = $scanner->scan_file($path_to_http_client_pm);
$hashref = $prereqs->as_string_hash();
foreach my $m (sort keys %$hashref) {
    print "  $m\n";
}

When you run this, you get the following to stdout:

  Carp
  HTTP::Lite
  perl
  strict
  warnings

HTTP::Client uses HTTP::Lite, in case you were curious. The 'perl' entry is from the following line in HTTP::Client:

use 5.8.0;

Not sure I'd do that, but you can always skip it in the output.
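Skipping the 'perl' entry takes one grep on the hash returned by as_string_hash(). A minimal sketch, using a hand-built CPAN::Meta::Requirements object as a stand-in for what scan_file() returns; the module list mirrors the output above, and the perl version is just illustrative.

```perl
use strict;
use warnings;
use CPAN::Meta::Requirements;

# Hand-built stand-in for the object returned by scan_file().
my $prereqs = CPAN::Meta::Requirements->new;
$prereqs->add_minimum($_ => 0) for qw(Carp HTTP::Lite strict warnings);
$prereqs->add_minimum(perl => '5.008');    # illustrative version

# Drop the entry for perl itself, keeping only real modules.
my $hashref = $prereqs->as_string_hash;
my @modules = grep { $_ ne 'perl' } sort keys %$hashref;
print "  $_\n" for @modules;
```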

As you can see, this would be most useful for generating dependency information where you have a tree of source, for example for one project. The distribution includes a script, scan_prereqs, which uses File::Find to traverse a directory in this way. A neat addition to this would be an option to follow dependencies: having identified HTTP::Client as a dependency, it could search @INC looking for it, and if it's found, scan that as well.

Actually, that sounded fun, so I knocked up a script to do that, and spit out a dot format graph. Here's the interesting part of the script:

use Perl::PrereqScanner;

$scanner = Perl::PrereqScanner->new;
push(@queue, 'HTTP::Client');

while (@queue > 0) {
    $module = shift @queue;
    next if exists $seen{$module};
    $seen{$module} = 1;
    
    if (defined($module_path = find_module_path($module))) {
        $prereqs = $scanner->scan_file($module_path);
        $depsref = $prereqs->as_string_hash();
        delete($depsref->{perl});
        $deps{$module} = $depsref;
        push(@queue, keys %{ $deps{$module} });
    }
}
generate_graph();
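The two helpers aren't shown in that excerpt. Here is a rough sketch of what they might look like; these are my guesses at the shape, not the code from the original script. In particular, this generate_graph() takes a hashref shaped like %deps and returns the dot text, rather than working on a global and printing.

```perl
use strict;
use warnings;
use File::Spec;

# Sketch: return the path of the first file under @INC that matches
# the module name, or undef (in scalar context) if it isn't installed.
sub find_module_path {
    my ($module) = @_;
    return unless defined $module && length $module;
    my @parts = split /::/, $module;
    $parts[-1] .= '.pm';
    foreach my $dir (@INC) {
        next if ref $dir;    # skip code refs and objects in @INC
        my $path = File::Spec->catfile($dir, @parts);
        return $path if -f $path;
    }
    return;
}

# Sketch: turn { module => { dependency => version, ... }, ... }
# into dot format, one edge per dependency.
sub generate_graph {
    my ($deps) = @_;
    my @lines = ('digraph dependencies {');
    foreach my $module (sort keys %$deps) {
        foreach my $dep (sort keys %{ $deps->{$module} }) {
            push @lines, qq{    "$module" -> "$dep";};
        }
    }
    push @lines, '}';
    return join("\n", @lines) . "\n";
}

print generate_graph({ 'HTTP::Client' => { 'HTTP::Lite' => 0, 'Carp' => 0 } });
```

The resulting text can be fed to Graphviz's dot command to produce an image.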

And here's the graph generated for HTTP::Client:

So, very different from the graph generated by Devel::DependencyGrapher, but similar to that generated by Module::Dependency.

Comparison

As noted in the introduction, there are three basic types of module:

Conclusion

If you want to know all the distributions you'd need for a module, perhaps because you want to bundle them all together, then I think CPAN::FindDependencies is the best bet. This isn't surprising, since the module is by David Cantrell, who runs the CPAN Dependencies service.

If you're not sure you want to trust the accuracy of the metadata in distributions, then you could use one of the modules which parses code to identify all possible dependencies. For this I'd use Module::Extract::Use, as it's built on top of PPI, and provides a bit more information than Perl::PrereqScanner. I think I need to do a more rigorous bake-off to see which of the parsing-based modules is most reliable.

And if you want to know what modules are pulled in when you use a module, then use Devel::Modlist, or Devel::TraceUse, at least until I've released my module :-)
