CPAN modules for locating an IP address

Neil Bowers

2012-08-08

This is a review of 12 CPAN modules which can be used to get information about the location of an IP address. As a minimum, I'm looking for modules which will return a country code, given an IP address or fully-qualified domain name (FQDN).

There are a number of reasons why you might want this:

Here are the 12 modules; please let me know if there are any modules I've missed.

Module                  Doc   Version   Author                 # bugs  # users  Last update
Geo::Coder::HostIP      CPAN  0.04      Neil Bowers            0       0        2012-01-15
Geo::IP                 CPAN  1.40      Boris Zentner          6       5        2011-08-23
Geo::IP2Location        CPAN  4.00      IP2Location            4       0        2012-04-04
Geo::IP::RU::IpGeoBase  CPAN  0.03      Руслан У. Закиров      0       0        2010-02-11
Geo::IPfree             CPAN  1.121660  Brian Cassidy          0       1        2012-06-14
IP::Country             CPAN  2.27      Nigel Wetters Gourlay  2       5        2009-07-25
IP::Country::DB_File    CPAN  2.01      Nick Wellnhofer        0       0        2011-01-19
IP::Country::DNSBL      CPAN  1.02      Nigel Wetters Gourlay  0       0        2006-12-19
IP::Info                CPAN  0.05      Mohammad S Anwar       0       0        2011-09-05
IP::Location            CPAN  0.01      Michael Wang           1       0        2011-01-10
IP::QQWry               CPAN  0.0.20    孙海军                 0       1        2011-11-26
IP::World               CPAN  0.37      Craig MacKenna         1       0        2010-07-03

I'll give a summary of each module in turn, then present results of evaluating the modules, and finally conclusions on which module might be the best bet in different situations.

Geo::Coder::HostIP

Geo::Coder::HostIP provides an interface to the online service for geocoding info at www.hostip.info.

use Geo::Coder::HostIP;
    
$geo     = Geo::Coder::HostIP->new();
@results = $geo->FetchIP('212.58.244.71');
if (@results > 0) {
    print "code = ", $geo->CountryCode, "\n";
} else {
    print "IP address not covered\n";
}

Every time you call FetchIP it makes a call to the online service, and populates an internal data structure. You can then call various methods to retrieve information such as latitude and longitude, city, country name, country code and state.

You can look up either an IPv4 address, using FetchIP, or a hostname, using FetchName.

The service at www.hostip.info is a community-based project to geolocate IP addresses, crowdsourcing the data, and making it freely available. If you use this module, don't hammer their service please.

The version of this module previously on CPAN had a number of bugs, and the original author, Seán Cannon, gave me co-maint so I could release a version with fixes.

Geo::IP

Geo::IP provides an interface to a database from MaxMind.com, which maps IP addresses to country codes. If installed, it will use the GeoIP C library, but if you don't have it, the module will fall back on a pure Perl implementation, which is included in the distribution (the separate Geo::IP::PurePerl module is now deprecated). For all my testing I had the C library installed.

use Geo::IP;

$gi     = Geo::IP->new(GEOIP_MEMORY_CACHE);
$code   = $gi->country_code_by_addr('212.58.244.71');
$code   = $gi->country_code_by_name('www.bbc.co.uk');

$record = $gi->record_by_addr('212.58.244.71');
$code   = $record->country_code;

You can pass a number of flags to the constructor. GEOIP_MEMORY_CACHE, used in the example above, uses more memory in exchange for higher performance; GEOIP_STANDARD uses less memory. See the documentation for further options.

The module provides two types of method:

  1. There are two methods which take an address and return a record object, which has methods for accessing information about the identified location. For example record_by_addr(), shown in the example above.
  2. There are 10+ methods which will return one specific piece of information, such as the two-letter country code (shown above), or country name.

A monthly update to the data file is available for download from maxmind.com. You can also subscribe to a commercial version ($50 initial, $12/month), which provides weekly updates, and improved accuracy.

There is also a data file which has richer information, including country name, region, city, post code, latitude and longitude, etc. This is also available either as a free data file (updated monthly), or on a commercial subscription basis ($370 initial, $90/month for updates).

You can download the C library from http://geolite.maxmind.com/download/geoip/api/c/

The module also supports IPv6, but this requires the underlying C library.

Geo::IP2Location

Geo::IP2Location provides lookup of various location information from IP, based on a commercial data file, available from ip2location.com. There are currently 20 different files available, each providing a different combination of location data, including Country, ISP, Region, City, latitude & longitude, postcode, and others.

Geo::IP2Location supports IPv4 and IPv6; it doesn't support lookup of FQDNs.

The smallest, and cheapest, database provides country information:

use Geo::IP2Location;

$gi   = Geo::IP2Location->open('IP-COUNTRY.BIN');
$code = $gi->get_country_short('212.58.244.71');
$name = $gi->get_country_long('212.58.244.71');

If you try and call a method that needs data not in the file you bought, the return value is a string saying the parameter is not available. This means you can't call the get_all() method to get the code and name of the country with a single call. I've submitted a feature suggestion via RT to add a get_country() method which would return a simple data object.

The get_country_short() method returns an ISO 3166 code, but for the United Kingdom it returns 'UK', instead of the official code, 'GB'. If you pass a private IP address you'll get back '-'. I don't know what you'll get back if you pass an IP address not covered by the database.
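If you want to compare these codes with the output of other modules, you need to tidy them up first. Here is a minimal sketch of how that might be done. The normalise_country_code() function is a hypothetical helper of my own, not part of Geo::IP2Location, and it only handles the two quirks described above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Geo::IP2Location): tidy up the value
# returned by get_country_short() before comparing it with other modules.
sub normalise_country_code {
    my ($code) = @_;

    return unless defined $code;         # nothing returned at all
    return if $code eq '-';              # private address, per the text above
    return 'GB' if uc($code) eq 'UK';    # use the official ISO 3166 code
    return uc $code;
}

for my $raw ('UK', 'FR', '-') {
    my $code = normalise_country_code($raw);
    printf "%-2s => %s\n", $raw, defined $code ? $code : '(none)';
}
```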

You can download sample databases from http://www.ip2location.com/developers.htm

If you just want the country information, it costs $49/year for a single server. The subscription also gets you monthly updates.

Geo::IP::RU::IpGeoBase

Geo::IP::RU::IpGeoBase provides a mapping from IP address to location information, but only for IP addresses in Russia. It's based on data provided by http://ipgeobase.ru, which you must download and load into a database. The distribution includes a script which will do this for you; for example to use SQLite:

% ip-geo-base-ru --dsn 'dbi:SQLite:geobase-ru.db' --create

Once you've done this, you can use the module to look up information on an IP address:

use Geo::IP::RU::IpGeoBase;

$gbase = Geo::IP::RU::IpGeoBase->new( db => { dsn => 'dbi:SQLite:dbname=geobase-ru.db' } );
($location) = $gbase->find_by_ip( '109.148.155.77' );
$city = $location->{city};
($latitude, $longitude) = ( $location->{latitude}, $location->{longitude} );

The find_by_ip method only supports IPv4 IP addresses. The module doesn't support IPv6, and you can't pass FQDNs.

The find_by_ip method returns zero, one or more records for the specified IP address, where each record is just a hash reference. In the underlying data a given IP address may appear in more than one record.

Geo::IPfree

Geo::IPfree uses a local data file, and returns both the 2-letter ISO country code, and the country name. It supports IPv4 addresses only, but can handle both IP addresses and FQDNs.

use Geo::IPfree;

$geo = Geo::IPfree->new();
$geo->Faster;
($code, $name) = $geo->LookUp('212.58.244.71');

The Faster method tells the module to load the data file into memory, for faster lookups. From the testing I've done, this gives about a 5% improvement, on average.

By default the module caches the results for up to 1000 IP addresses. When the cache reaches 1000 addresses it is cleared and starts again from empty. This means that if you look up the same address multiple times close together, as you might when processing an HTTP access log for example, you'll see the benefit. But if the repeats are more than 1000 lookups apart you won't see any benefit. Cache hits are beneficial, because the full lookup is relatively slow (see the Comparison section below).
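To make the caching policy concrete, here is a small sketch of a cache that is emptied when it fills up. This is an illustration of the behaviour described above and not Geo::IPfree's actual code; make_cached_lookup() and the stand-in lookup are hypothetical names of my own.

```perl
use strict;
use warnings;

# Illustration only: a cache that is emptied once it holds $max entries.
# The names here (make_cached_lookup, $slow_lookup) are hypothetical and
# are not taken from Geo::IPfree.
sub make_cached_lookup {
    my ($slow_lookup, $max) = @_;
    my %cache;

    return sub {
        my ($ip) = @_;
        return $cache{$ip} if exists $cache{$ip};    # cache hit

        %cache = () if keys(%cache) >= $max;         # full: start again from empty
        return $cache{$ip} = $slow_lookup->($ip);    # cache miss: do the real lookup
    };
}

# Stand-in for the expensive lookup; it just counts how often it is called.
my $calls  = 0;
my $lookup = make_cached_lookup(sub { $calls++; return 'GB' }, 1000);

$lookup->('212.58.244.71') for 1 .. 5;
print "5 lookups of one address, $calls real lookup(s)\n";
```

A cache like this is cheap to maintain, but an address that was looked up just before the cache was cleared has to be looked up again in full.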

If you pass an FQDN, then you get an extra return value -- the IP address for the FQDN.

($code, $name, $ip) = $geo->LookUp( 'www.cnn.com' );

The module uses a data file which is freely available from WebNet77. The file is updated daily, but they suggest that most users should only update it on a monthly basis. The distribution includes a copy of the data file, and the module's author releases a new version of the module every four months or so, to ensure you have an up-to-date database. I downloaded a fresh copy for my tests: on the bottom right-hand side of the page, there's a Download section: select the Geo::IPfree format before downloading.

The module also supports a number of utility functions for processing IP addresses and FQDNs.

IP::Country

IP::Country provides an OO API on top of various databases; the main method takes an IP address or FQDN and returns a two-letter country code:

use IP::Country::Fast;

$ipc = IP::Country::Fast->new();
$country = $ipc->inet_atocc('212.58.244.71');

If you pass a private IP address, it will return **. If the IP address isn't covered, you'll get undef back.

The module supports two other methods: inet_ntocc() takes the format returned by inet_aton(3) and returns a country code. db_time() returns the creation time of the database, in seconds since the epoch.
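If you haven't come across the inet_aton() format, the following sketch shows what it looks like, using the inet_aton() and inet_ntoa() functions from Perl's core Socket module. It doesn't call IP::Country at all; the $packed value is what I'd expect you to hand to inet_ntocc().

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# inet_aton() packs a dotted-quad address into a 4-byte string; this is
# the format that inet_ntocc() is described as accepting.
my $packed = inet_aton('212.58.244.71');

printf "packed length: %d bytes\n", length $packed;
printf "as hex:        %s\n", unpack('H*', $packed);
printf "round trip:    %s\n", inet_ntoa($packed);
```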

IPv6 addresses are not supported.

IP::Country::DB_File

IP::Country::DB_File provides the same interface as IP::Country, but works with a DB_File data file, generated from publicly available databases.

use IP::Country::DB_File;
    
$ipc  = IP::Country::DB_File->new( 'ipcc.db' );
$code = $ipc->inet_atocc('212.58.244.71');

It will return the pseudo-code of ** if passed a private IP address.

Before you can use the module you have to build the data file, using the IP::Country::DB_File::Builder module, which is part of the distribution:

perl -MIP::Country::DB_File::Builder -e command -- -fbr

It would be more user-friendly if the distribution included a script for rebuilding the database. Furthermore, the above command just generates the database in the current directory, and you have to pass the path to it when calling the constructor. It would be even more user friendly for the database to be built when the module is installed, and put in a standard location. The command-line script could then update that location.

The module doesn't support IPv6, or the passing of FQDNs rather than IP addresses.

IP::Country::DNSBL

Note: the default service used by this module is currently offline, so this module doesn't work.

IP::Country::DNSBL provides the same interface as IP::Country, but queries a DNSBL server, by default using country.netop.org, which is provided by the NetOp organisation:

use IP::Country::DNSBL;
    
$ipc  = IP::Country::DNSBL->new();
$code = $ipc->inet_atocc('212.58.244.71');

If you don't know what DNSBL is about, NetOp have a page explaining how this works.

The module only works with IPv4 addresses and ASCII hostnames.

IP::Info

IP::Info provides an interface to the Quova service, which provides a RESTful API for getting information about an IP address. You can sign up for a free developer account on Quova, but it's limited to a maximum rate of 2 lookups per second, and 1000 lookups per day.

use IP::Info;

$ipinfo = IP::Info->new(
                        apikey => ' ... ',
                        secret => ' ... ',
                        format => 'json'
                       );
$response = $ipinfo->ipaddress('212.58.244.71');
$code     = $response->country_code();

The ipaddress method returns an instance of IP::Info::Response, a data class which has methods for getting the country code, country name, latitude and longitude, area code, and much more.

Quova boast that they have the most IP geolocation data, and the most accurate. So if you're after coverage, this might be your best bet. Using this module also has the advantage that you don't have to install a local copy of the data file, and are thus always "up to date".

This module was only recently released, but it has already evolved in a good direction, partly as a result of earlier versions of this review.

The module only supports IPv4 IP addresses (in decimal or dot-decimal notation). IPv6 and FQDNs are not supported.

In addition to the free rate-limited service, you can sign up for commercial service, which obviously isn't rate limited. You pay varying amounts per lookup; details on www.quova.com.
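If you're using the free service, your script has to stay within the limits itself. Here is a minimal sketch of one way to do that, assuming the limits quoted above (2 lookups per second, 1000 per day). The make_throttled() wrapper is hypothetical and not part of IP::Info, the lookup is a stand-in, and resetting the count each day is left out.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical wrapper (not part of IP::Info): spaces calls at least
# $min_gap seconds apart and refuses to exceed $max_calls in total.
# A real script would also reset the count once per day.
sub make_throttled {
    my (%arg) = @_;
    my $lookup    = $arg{lookup};
    my $min_gap   = $arg{min_gap}   // 0.5;     # 2 lookups per second
    my $max_calls = $arg{max_calls} // 1000;    # free daily allowance
    my $now       = $arg{now}  || sub { time() };
    my $wait      = $arg{wait} || sub { sleep($_[0]) };

    my ($last, $count) = (undef, 0);
    return sub {
        my ($ip) = @_;
        die "lookup quota of $max_calls exhausted\n" if $count >= $max_calls;

        if (defined $last) {
            my $gap = $now->() - $last;
            $wait->($min_gap - $gap) if $gap < $min_gap;    # too soon: pause
        }
        $last = $now->();
        $count++;
        return $lookup->($ip);
    };
}

# Stand-in lookup; with IP::Info this would call $ipinfo->ipaddress($ip).
my $throttled = make_throttled(lookup => sub { return "looked up $_[0]" });
print $throttled->('212.58.244.71'), "\n";
```

The now and wait arguments are only there so the wrapper can be tested with a fake clock.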

IP::Location

IP::Location is an interface to QQWry data, which appears to be a file with location data associated with IP ranges, where the location information is in Chinese. I'm not entirely sure, since most of the resources online describing it are in Chinese.

use IP::Location;

$locator = IP::Location->new('qqwry.dat', 'UTF-8');
$location = $locator->locate('212.58.244.71');

binmode STDOUT, ":utf8";
print "location = $location\n";

This results in the following:

% ./ip-location.pl
location = 英国

Here the constructor takes two arguments. The first is the path to your local copy of the QQWry data file. The second is optional, and specifies the character set you want the returned value to be in. By default this is 'GBK'; the other supported value is 'UTF-8'.

The documentation doesn't make clear what will be returned if the data file doesn't contain any information on the passed IP address.

The SYNOPSIS section of the document starts with

no warnings;

Without this, you'll occasionally get a warning about use of an uninitialized value within the module.

The documentation suggests that the module has an online mode of working, where it will query a service, but looking at the code this clearly hasn't been implemented (yet).

The documentation also says you can download the data file from http://www.cz88.net/fox/, but I couldn't find it there. I tracked down a copy of the data file with Google:

http://code.google.com/p/stid/downloads/detail?name=qqwry.dat

Using the info() method, it looks like this version is fairly recent: 2011-05-10. If anyone knows the definitive source for this file, please let me know via the comments. The format is described (in Chinese) at http://lumaqq.linuxsir.org/article/qqwry_format_detail.html.

I believe the module only supports IPv4 IP addresses, and that IPv6 and FQDNs aren't supported. I've emailed the author for confirmation.

IP::QQWry

IP::QQWry is another interface to the QQWry database.

use IP::QQWry;
    
$qqwry = IP::QQWry->new('qqwry.dat');
($country, $area) = $qqwry->query('212.58.244.71');

The documentation says that qqwry.dat uses GBK or Big5 encoding, and doesn't make clear which will be returned. I have no experience handling Chinese encodings, so haven't successfully displayed any results.
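For what it's worth, here is a sketch of how GBK-encoded bytes can be turned into something displayable, using the core Encode module. It assumes the data is GBK rather than Big5, and the $raw value is a stand-in I've constructed for illustration; I haven't confirmed that this is exactly what query() returns.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Stand-in for the raw bytes a QQWry lookup might return: the GBK
# encoding of a two-character Chinese string (U+82F1 U+56FD).
my $raw = encode('GBK', "\x{82F1}\x{56FD}");

# Decode the bytes into Perl characters, assuming the data is GBK.
my $text = decode('GBK', $raw);

# Encode on output so a UTF-8 terminal can display the result.
binmode STDOUT, ':encoding(UTF-8)';
print "country = $text\n";
```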

If the datafile doesn't contain information on the passed IP address, the query() method will return undef.

The query method only supports IPv4 IP addresses in decimal or dot-decimal notation. IPv6 and FQDNs are not supported.

IP::World

IP::World comes with its own data file based on two free databases. It provides a single method which returns a two-letter country code:

use IP::World;
    
$ipworld = IP::World->new(1);
$code    = $ipworld->getcc('212.58.244.71');

The constructor takes a single argument, which controls the trade-off between memory usage and performance. It specifies how the module should process the data file: 0 keeps all the data in memory, 1 uses mmap, 2 searches the data file on disk using traditional C I/O calls, and 3 uses PerlIO. I've submitted a feature request via RT that constants be defined for these values, similar to Geo::IP, for example.

The getcc() method takes an IPv4 address in either the dotted quad or decimal format. IPv6 and FQDNs are not supported.

If you pass an invalid address, the method returns **. If the database doesn't provide information for the specified address, getcc() will return ??.
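The two address formats mentioned above are easy to convert between. A minimal sketch, using only pack and unpack; quad_to_decimal() and decimal_to_quad() are hypothetical helper names of my own and are not part of IP::World.

```perl
use strict;
use warnings;

# Convert a dotted-quad IPv4 address to its 32-bit decimal form.
# Returns undef if the string doesn't look like a valid dotted quad.
sub quad_to_decimal {
    my ($quad) = @_;
    my @octets = $quad =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\z/
        or return;
    return if grep { $_ > 255 } @octets;
    return unpack 'N', pack 'C4', @octets;
}

# Convert a 32-bit decimal address back to dotted-quad form.
sub decimal_to_quad {
    my ($decimal) = @_;
    return join '.', unpack 'C4', pack 'N', $decimal;
}

my $decimal = quad_to_decimal('212.58.244.71');
print "decimal: $decimal\n";
print "quad:    ", decimal_to_quad($decimal), "\n";
```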

The distribution contains 3 scripts, one of which can be used to update your local data file. This is used at installation time to ensure you have the latest data.

Comparison

There are two main areas in which I've compared these modules, coverage and performance. Some of the modules I couldn't really test, either because I don't speak the relevant language, or because it didn't make sense. Since I first posted this, I've been given a copy of the database for Geo::IP2Location, so I could include it in this comparison.

If you're processing logfiles, then you might want to look up thousands of IP addresses per minute, and performance will be important.

But first, a summary of capabilities, by module.

Capabilities

Module                  IPv6  FQDN  Active  Extra  Style
Geo::Coder::HostIP                                 remote
Geo::IP                                            local
Geo::IPfree                                        local
Geo::IP::RU::IpGeoBase                             local
Geo::IP2Location                                   local
IP::Country                                        local
IP::Country::DB_File                               local
IP::Country::DNSBL                                 remote
IP::Info                                           remote
IP::Location                                       local
IP::QQWry                                          local
IP::World                                          local

The non-obvious columns are:

Coverage

I generated a list of 4 hostnames for every valid country code, looked up the IP address, and then looked up the country code using the modules listed below. I ended up with a list of 940 IP addresses; I wanted under 1000, since the service behind IP::Info is limited to 1000 lookups per day.

There are two figures given for each module: coverage shows the percentage of IP addresses for which a valid country code was returned; hit rate shows what percentage of country codes seemed to be right.

Module                Coverage (%)  Hit rate (%)
IP::Country::DB_File  100.0         83.6
Geo::IPfree           100.0         83.6
Geo::IP               100.0         83.5
IP::World             100.0         83.1
IP::Info              100.0         82.7
Geo::IP2Location      100.0         81.3
IP::Country::Fast     91.1          79.4
Geo::Coder::HostIP    100.0         68.2
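To make the two figures concrete, here is a sketch of how they could be computed. It is not the script I used. It assumes that a valid answer is any two-letter upper-case code, and that hit rate is measured over the addresses that got a valid answer; both are assumptions made for the example, and the data is made up.

```perl
use strict;
use warnings;

# Sketch of the two figures. Assumptions made for this example: a "valid"
# answer is any two-letter code, and hit rate is measured over the
# addresses that got a valid answer.
sub coverage_and_hit_rate {
    my ($expected, $returned) = @_;    # hash refs: IP address => country code

    my ($total, $covered, $hits) = (0, 0, 0);
    for my $ip (keys %$expected) {
        $total++;
        my $code = $returned->{$ip};
        next unless defined $code && $code =~ /^[A-Z]{2}$/;
        $covered++;
        $hits++ if $code eq $expected->{$ip};
    }

    my $coverage = $total   ? 100 * $covered / $total : 0;
    my $hit_rate = $covered ? 100 * $hits / $covered  : 0;
    return ($coverage, $hit_rate);
}

# Made-up data purely to exercise the function.
my %expected = ('192.0.2.1' => 'GB', '192.0.2.2' => 'FR',
                '192.0.2.3' => 'DE', '192.0.2.4' => 'US');
my %returned = ('192.0.2.1' => 'GB', '192.0.2.2' => 'FR',
                '192.0.2.3' => 'US', '192.0.2.4' => undef);

printf "coverage %.1f%%, hit rate %.1f%%\n",
    coverage_and_hit_rate(\%expected, \%returned);
```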

Obviously not all domains are hosted in the country associated with the country code, so I had to use some heuristics to decide what the correct country code should be:

The problem with this approach is that one module could be right and all the others wrong, and there isn't an independent source I can use to verify the location (that I know of). As I've added modules to this bake-off some of the figures have changed quite a bit. For a future version I'll try and generate error bounds for the hit rate.

Performance

I generated a list of 100,000 different random IP addresses and saved them in a file. I then Benchmark'd each module to see how long it took to look up the IP addresses, using the same framework code for each of them. The Benchmark only covered the time to look up the IP addresses; it didn't cover any module initialisation time, etc.

Some of the modules are so quick that Benchmark complained that I wasn't running them long enough to get an accurate figure. So I ended up running 1 million iterations, looping over my list of IP addresses ten times. At least one of the modules caches results, so I decided to do two runs: (1) 10 iterations over 100k addresses, and (2) 1 million unique IP addresses. Times are given in seconds:

Module                10 x 100k  1 million
Geo::IP               0.88       0.78
IP::World             1.03       0.90
IP::Country::DB_File  6.95       3.76
IP::Country::Fast     16.53      17.17
IP::QQWry             29.94      208.76
IP::Location          198.28     195.53
Geo::IPfree           1264.64    1192.05
Geo::IP2Location      9333.87    9388.03

As you can see, IP::QQWry is a lot faster when seeing repeat IP addresses. Interestingly, IP::Country::DB_File appears to be slower when seeing repeat IP addresses.
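The first of the two runs can be sketched as follows, using the core Benchmark module. This is not my actual framework code: the lookup() function is a stand-in so the sketch runs without any geolocation module installed, the list is much shorter than the one I used, and the addresses are generated in memory instead of being read from a file.

```perl
use strict;
use warnings;
use Benchmark qw(timeit timestr);

srand(42);    # fixed seed so the address list is repeatable

# Build a list of $n distinct random IPv4 addresses.
sub random_addresses {
    my ($n) = @_;
    my %seen;
    while (keys(%seen) < $n) {
        my $ip = join '.', map { int rand 256 } 1 .. 4;
        $seen{$ip} = 1;
    }
    return [ sort keys %seen ];
}

# Stand-in for a module's lookup method, so the sketch runs without any
# geolocation module installed.
my $lookups = 0;
sub lookup { $lookups++; return 'GB' }

my $addresses = random_addresses(1_000);    # the review used 100,000

# Time only the lookups; building the list happens outside timeit().
my $timing = timeit(10, sub { lookup($_) for @$addresses });

print "10 passes over ", scalar(@$addresses), " addresses: ",
    timestr($timing), "\n";
```

Swapping the body of lookup() for a call to a real module's method would time that module instead.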

I didn't include the following modules in the performance comparison:

Conclusion

There's not much difference between Geo::IP, IP::World, IP::Info, Geo::IPfree, and IP::Country::DB_File.
