Catmandu::Importer(3pm) User Contributed Perl Documentation Catmandu::Importer(3pm)

NAME

Catmandu::Importer - Namespace for packages that can import

SYNOPSIS

    # From the command line
    # JSON is an importer and YAML an exporter
    $ catmandu convert JSON to YAML < data.json
    # OAI is an importer and JSON an exporter
    $ catmandu convert OAI --url http://biblio.ugent.be/oai to JSON 
    # Fetch remote content
    $ catmandu convert JSON --file http://example.com/data.json to YAML
    
    # From Perl
    
    use Catmandu;
    use Data::Dumper;
    my $importer = Catmandu->importer('JSON', file => 'data.json');
    $importer->each(sub {
        my $item = shift;
        print Dumper($item);
    });
    my $num = $importer->count;
    my $first_item = $importer->first;
    # Convert OAI to JSON in Perl
    my $oai_importer = Catmandu->importer('OAI', url => 'http://biblio.ugent.be/oai');
    my $exporter = Catmandu->exporter('JSON');
    $exporter->add_many($oai_importer);
    $exporter->commit;

DESCRIPTION

A Catmandu::Importer is a Perl package that can generate structured data from sources such as JSON, YAML, XML and RDF, from network protocols such as Atom, OAI-PMH and SRU, and even from DBI databases. Given a Catmandu::Importer, a programmer can read data from it using any of the many Catmandu::Iterable methods:
    $importer->to_array;
    $importer->count;
    $importer->each(\&callback);
    $importer->first;
    $importer->rest;
    ...etc...
Every Catmandu::Importer is also Catmandu::Fixable and thus inherits a 'fix' parameter that can be set in the constructor. When a 'fix' parameter is given, each item returned by the generator is automatically fixed using one or more Catmandu::Fix-es. E.g.
    my $importer = Catmandu->importer('JSON',fix => ['upcase(title)']);
    $importer->each( sub {
        my $item = shift ; # Every $item->{title} is now upcased... 
    });
    # or via a Fix file
    my $importer = Catmandu->importer('JSON',fix => ['/my/fixes.txt']);
    $importer->each( sub {
        my $item = shift ; # Every $item->{title} is now upcased... 
    });
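Conceptually, the 'fix' parameter wraps the importer's generator so that every item passes through the fixes before the caller sees it. The following is a minimal plain-Perl sketch of that idea, not Catmandu's actual implementation; with_fixes and $upcase_title are hypothetical names standing in for the real Fix machinery.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a generator (a code ref that returns one hash
# per call and undef at end of stream) so that each item is passed through
# a list of transformations, in order.
sub with_fixes {
    my ($generator, @fixes) = @_;
    return sub {
        my $item = $generator->();
        return undef unless defined $item;    # end of stream: pass it on
        $item = $_->($item) for @fixes;
        return $item;
    };
}

# Stand-in for the 'upcase(title)' fix (an assumption for illustration).
my $upcase_title = sub {
    my ($item) = @_;
    $item->{title} = uc $item->{title} if defined $item->{title};
    return $item;
};

# A tiny in-memory source of records.
my @records = ({title => 'hello'}, {title => 'world'});
my $source  = sub { return shift @records };

my $fixed = with_fixes($source, $upcase_title);
my @titles;
while (defined(my $item = $fixed->())) {
    push @titles, $item->{title};
}
print join(',', @titles), "\n";    # prints "HELLO,WORLD"
```

Because the wrapper is itself a generator, several layers can be stacked without the consumer noticing.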

CONFIGURATION

file
Read input from a local file given by its path. If the path looks like a URL, the content is fetched first and then passed to the importer. Alternatively, a scalar reference can be passed to read from a string.
fh
Read input from an IO::Handle. If not specified, Catmandu::Util::io is used to create the input stream from the "file" argument or by using STDIN.
encoding
Binmode of the input stream "fh". Set to ":utf8" by default.
fix
An ARRAY of one or more Fix-es or Fix scripts to be applied to imported items.
data_path
The data at "data_path" is imported instead of the original data.
 
   # given this imported item:
   {abc => [{a=>1},{b=>2},{c=>3}]}
   # with data_path 'abc', this item gets imported instead:
   [{a=>1},{b=>2},{c=>3}]
   # with data_path 'abc.*', 3 items get imported:
   {a=>1}
   {b=>2}
   {c=>3}
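The two "data_path" cases above can be modelled in plain Perl. This is a simplified sketch under stated assumptions, not Catmandu's own path handling: expand_data_path is a hypothetical helper that understands only hash keys and a '*' step that fans out over array elements.

```perl
use strict;
use warnings;

# Hypothetical helper: resolve a dot-separated path against one record and
# return the list of values found. A '*' step expands to every element of
# an array. Anything else in the path is treated as a hash key.
sub expand_data_path {
    my ($item, $path) = @_;
    my @current = ($item);
    for my $step (split /\./, $path) {
        my @next;
        for my $node (@current) {
            if ($step eq '*' && ref $node eq 'ARRAY') {
                push @next, @$node;
            }
            elsif (ref $node eq 'HASH' && exists $node->{$step}) {
                push @next, $node->{$step};
            }
        }
        @current = @next;
    }
    return @current;
}

my $record = {abc => [{a => 1}, {b => 2}, {c => 3}]};

my @whole = expand_data_path($record, 'abc');      # one item: the array
my @each  = expand_data_path($record, 'abc.*');    # three items: the hashes

printf "%d item(s) for 'abc', %d item(s) for 'abc.*'\n",
    scalar @whole, scalar @each;
```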
    
variables
Variables given here are interpolated into the "file" and "http_body" options. The syntax is the same as URI::Template.
 
    # named arguments
    my $importer = Catmandu->importer('JSON',
        file => 'http://{server}/{path}',
        variables => {server => 'biblio.ugent.be', path => 'file.json'},
    );
    # positional arguments
    my $importer = Catmandu->importer('JSON',
        file => 'http://{server}/{path}',
        variables => 'biblio.ugent.be,file.json',
    );
    # or
    my $importer = Catmandu->importer('JSON',
        file => 'http://{server}/{path}',
        variables => ['biblio.ugent.be','file.json'],
    );
    # or via the command line
    $ catmandu convert JSON --file 'http://{server}/{path}' --variables 'biblio.ugent.be,file.json'
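To make the named and positional forms concrete, here is a plain-Perl sketch of the substitution. It handles only simple {name} placeholders and assumes positional values are matched to placeholders in order of appearance; interpolate is a hypothetical helper, and the full URI::Template syntax is not covered.

```perl
use strict;
use warnings;

# Hypothetical helper: fill {name} placeholders in a template.
# $vars may be a hash ref (named), or an array ref or comma-separated
# string (both positional, matched to placeholders in order of appearance).
sub interpolate {
    my ($template, $vars) = @_;
    if (ref $vars ne 'HASH') {
        my @values = ref $vars eq 'ARRAY' ? @$vars : split /,/, $vars;
        my @names  = $template =~ /\{(\w+)\}/g;
        my %named;
        @named{@names} = @values;
        $vars = \%named;
    }
    # Unknown placeholders expand to the empty string in this sketch.
    $template =~ s/\{(\w+)\}/defined $vars->{$1} ? $vars->{$1} : ''/ge;
    return $template;
}

my $template = 'http://{server}/{path}';
my $named    = interpolate($template,
    {server => 'biblio.ugent.be', path => 'file.json'});
my $from_string = interpolate($template, 'biblio.ugent.be,file.json');
my $from_array  = interpolate($template, ['biblio.ugent.be', 'file.json']);

print "$named\n";    # prints "http://biblio.ugent.be/file.json"
```

All three calls produce the same URL, which mirrors the equivalence of the three importer examples above.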
    

HTTP CONFIGURATION

These options are only relevant if "file" is a URL. See LWP::UserAgent for details about these options.
http_body
Set the GET/POST message body.
http_method
Set the type of HTTP request: 'GET', 'POST', ...
http_headers
A reference to an HTTP::Headers object.

Set your own HTTP client

user_agent(LWP::UserAgent->new(...))
Use the given LWP::UserAgent instance as the HTTP client.

Alternatively, set the parameters of the default client

http_agent
A string containing the name of the HTTP client.
http_max_redirect
Maximum number of HTTP redirects allowed.
http_timeout
Maximum execution time.
http_verify_hostname
Verify the SSL certificate.
http_retry
Maximum number of times to retry the HTTP request if it temporarily fails. The default is not to retry. See LWP::UserAgent::Determined for the HTTP status codes that initiate a retry.
http_timing
Number of retries and the delay before each retry if the HTTP request temporarily fails. The default is not to retry. See LWP::UserAgent::Determined for the HTTP status codes that initiate a retry and the format of the timing value.
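The retry behaviour can be pictured with a small plain-Perl model. It is only a sketch: request_with_retry is a hypothetical helper, the timing value is assumed to be a comma-separated list of delays in seconds such as "1,3,15", and the set of retryable status codes (408, 500, 502, 503, 504) is an assumption to be checked against the LWP::UserAgent::Determined documentation.

```perl
use strict;
use warnings;

# Hypothetical helper: call $request (a code ref returning an HTTP status
# code) and retry on temporary failures, waiting the given delays between
# attempts. $sleep is injectable so the sketch runs without real waiting.
sub request_with_retry {
    my ($request, $timing, $sleep) = @_;
    $sleep ||= sub { sleep $_[0] };
    my %retry_on = map { $_ => 1 } (408, 500, 502, 503, 504);    # assumed
    my @delays   = split /,/, $timing;
    my $attempts = 0;
    while (1) {
        my $status = $request->();
        $attempts++;
        # Stop on success, on a permanent failure, or when delays run out.
        return ($status, $attempts)
            unless $retry_on{$status} && @delays;
        $sleep->(shift @delays);
    }
}

# Simulated server: fails twice with 503, then succeeds.
my @responses = (503, 503, 200);
my @waited;
my ($status, $attempts) = request_with_retry(
    sub { shift @responses },
    '1,3,15',
    sub { push @waited, $_[0] },    # record delays instead of sleeping
);
print "status=$status attempts=$attempts waited=@waited\n";
# prints "status=200 attempts=3 waited=1 3"
```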

METHODS

first, each, rest, ...

See Catmandu::Iterable for all inherited methods.

CODING

Create your own importer by creating a Perl package in the Catmandu::Importer namespace that implements "Catmandu::Importer". Basically, you need to create a method 'generator' which returns a callback that creates one Perl hash for each call:
    my $importer = Catmandu::Importer::Hello->new;
    my $next = $importer->generator;
    $next->(); # record
    $next->(); # next record
    $next->(); # undef = end of stream
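The pull-style protocol described above (one record per call, undef at end of stream) can be tried out in plain Perl without Catmandu. In this sketch make_generator is a hypothetical name and the records are built from an in-memory list rather than an input stream.

```perl
use strict;
use warnings;

# Hypothetical generator factory: returns a closure that yields one hash
# per call and undef once the input is exhausted.
sub make_generator {
    my @names = @_;
    my $n = 0;    # private to this closure, so each generator counts alone
    return sub {
        return undef unless @names;
        my %record = (n => ++$n, hello => shift @names);
        return \%record;
    };
}

my $next = make_generator('Alice', 'Bob');
my @seen;
# The consumer keeps calling until it receives undef.
while (defined(my $record = $next->())) {
    push @seen, "$record->{n}:$record->{hello}";
}
print "@seen\n";    # prints "1:Alice 2:Bob"
```

Keeping the state in lexical variables captured by the closure means two generators created from the same factory never share a position in the stream.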
Here is an example of a simple "Hello" importer:
    package Catmandu::Importer::Hello;
    use Catmandu::Sane;
    use Moo;
    with 'Catmandu::Importer';
    sub generator {
        my ($self) = @_;
        my $fh = $self->fh;    # input stream built from "file", "fh" or STDIN
        my $n = 0;
        return sub {
            $self->log->debug("generating record " . ++$n);
            my $name = $fh->getline;    # undef at end of input
            return defined $name ? { "hello" => $name } : undef;
        };
    }
    1;
This importer can be called via the command line as:
    $ catmandu convert Hello to JSON < /tmp/names.txt
    $ catmandu convert Hello to YAML < /tmp/names.txt
    $ catmandu import Hello to MongoDB --database_name test < /tmp/names.txt
Or via Perl:
    use Catmandu;
    my $importer = Catmandu->importer('Hello', file => '/tmp/names.txt');
    $importer->each(sub {
        my $item = shift;    # one { hello => ... } record per input line
    });

SEE ALSO

Catmandu::Iterable, Catmandu::Fix, Catmandu::Importer::CSV, Catmandu::Importer::JSON, Catmandu::Importer::YAML
2017-10-01 perl v5.26.0