NAME
   Web::PageMeta - get page open-graph / meta data

SYNOPSIS
       use feature 'say';
       use Web::PageMeta;
       my $page = Web::PageMeta->new(url => "https://www.apa.at/");
       say $page->title;
       say $page->image;

   Fetch previews and images asynchronously:

       use feature 'say';
       use Future;
       use Web::PageMeta;
       my @urls = qw(
           https://www.apa.at/
           http://www.diepresse.at/
           https://metacpan.org/
           https://github.com/
       );
       my @page_views = map { Web::PageMeta->new( url => $_ ) }
               @urls;
       Future->wait_all( map { $_->fetch_image_data_ft } @page_views )->get;
       foreach my $pv (@page_views) {
           say 'title> '.$pv->title;
           say 'img_size> '.length($pv->image_data);
       }

       # alternatively, instead of Future->wait_all(), limit concurrency
       use Future::Utils qw( fmap_void );
       fmap_void(
           sub { return $_[0]->fetch_image_data_ft },
           foreach    => [@page_views],
           concurrent => 3
       )->get;

DESCRIPTION
   Fetches web page meta data, including (but not limited to) open-graph
   data. Can be used in both synchronous and async code.

   If a download returns any HTTP status code other than 200, an
   HTTP::Exception is thrown.
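
   A minimal sketch of catching such an exception. It assumes that the
   download happens when an accessor is first called and that the thrown
   object provides a "code" method, as HTTP::Exception objects do. The URL
   is hypothetical.

```perl
use strict;
use warnings;
use feature 'say';
use Scalar::Util qw(blessed);
use Web::PageMeta;

# Hypothetical URL, used only for illustration.
my $page = Web::PageMeta->new(url => 'https://example.com/no-such-page');

# Assumption: the page is downloaded when the accessor is first called,
# so this is where a non-200 response would surface as an exception.
my $title = eval { $page->title };
if (my $err = $@) {
    if (blessed($err) && $err->can('code')) {
        # HTTP::Exception objects carry the HTTP status code.
        say 'download failed with status ' . $err->code;
    }
    else {
        die $err;    # not an HTTP error, re-throw
    }
}
else {
    say 'title> ' . ($title // '(none)');
}
```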

ACCESSORS
 new
   Constructor, only "url" is required.

 url
   HTTP URL to fetch data from.

 user_agent
   User-Agent header to use for HTTP requests. The default is the one sent
   by Chrome 89.0.4389.90.

 extra_headers
   Hash ref with extra HTTP request headers.

 cookie_jar
   Optional HTTP::Cookies compatible object that must provide a
   "get_cookies()" method. If set, HTTP cookie headers are sent with each
   request.
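
   A sketch of passing the optional request settings above to the
   constructor. The URL, the User-Agent string, the header and the cookie
   values are all made up for illustration; a "get_cookies()" method is
   assumed to be available in the installed HTTP::Cookies version.

```perl
use strict;
use warnings;
use HTTP::Cookies;
use Web::PageMeta;

# In-memory cookie jar holding one illustrative cookie.
my $jar = HTTP::Cookies->new;

# set_cookie($version, $key, $val, $path, $domain, $port,
#            $path_spec, $secure, $maxage, $discard)
$jar->set_cookie(0, 'session', 'abc123', '/', 'www.example.com',
    undef, 1, 0, 3600, 0);

my $page = Web::PageMeta->new(
    url           => 'https://www.example.com/',          # hypothetical URL
    user_agent    => 'MyPreviewBot/1.0',                  # overrides the default
    extra_headers => { 'Accept-Language' => 'en' },       # extra request headers
    cookie_jar    => $jar,                                # cookies sent per request
);
```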

 title
   Returns title of the page.

 description
   Returns description of the page.

 image
   Returns image location of the page.

 image_data
   Returns image binary data of "image" link.

   Throws a 404 exception if there is no "image" link.

 page_meta
   Returns hash ref with all open-graph data.

 extra_scraper
   Web::Scraper object used to fetch the image, title or description from a
   location other than the default one.

       use Web::Scraper;
       use Web::PageMeta;
       my $escraper = scraper {
           process_first '.slider .camera_wrap div', 'image' => '@data-src';
       };
       my $wmeta = Web::PageMeta->new(
           url => 'https://www.meon.eu/',
           extra_scraper => $escraper,
       );

 page_body_hdr
   Returns array ref with page [$body,$headers]. Can be useful for
   post-processing or special/additional data extractions.
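
   A sketch of such post-processing. The structure of $headers is not
   described here, so the code only prints it when it turns out to be a
   hash ref; the "canonical" link extraction is just an illustrative use of
   the body.

```perl
use strict;
use warnings;
use feature 'say';
use Web::PageMeta;

my $page = Web::PageMeta->new(url => 'https://metacpan.org/');

# page_body_hdr returns an array ref: [$body, $headers]
my ($body, $headers) = @{ $page->page_body_hdr };
say 'body length> ' . length($body // '');

# Assumption: headers may be a hash ref of name => value pairs.
if (ref $headers eq 'HASH') {
    say "$_: " . ($headers->{$_} // '') for sort keys %$headers;
}

# Illustrative extra extraction: the canonical link, if the page has one.
if (defined $body
    && $body =~ m{<link[^>]+rel=["']canonical["'][^>]*href=["']([^"']+)["']}i)
{
    say "canonical> $1";
}
```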

 fetch_page_meta_ft
   Returns a future object for fetching page meta data. See "ASYNC USE". On
   done, the "page_meta" hash ref is returned.

 fetch_image_data_ft
   Returns a future object for fetching image data. See "ASYNC USE". On
   done, the "image_data" scalar is returned.

 fetch_page_body_hdr_ft
   Returns a future object for fetching page content and headers. See
   "ASYNC USE". On done, the "page_body_hdr" array ref is returned.

ASYNC USE
   To run multiple page meta data or image HTTP requests in parallel, or to
   use this module in async programs, call "fetch_page_meta_ft" and
   "fetch_image_data_ft", which return Future objects. See "SYNOPSIS" or
   t/02_async.t for sample use.
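
   A sketch of fetching meta data for several pages in parallel while
   handling failures per page. It assumes that a failed download (for
   example a non-200 status) fails the corresponding future rather than
   aborting the whole batch; the key names inside "page_meta" are not
   assumed, they are only listed.

```perl
use strict;
use warnings;
use feature 'say';
use Future;
use Web::PageMeta;

my @pages = map { Web::PageMeta->new(url => $_) } qw(
    https://metacpan.org/
    https://github.com/
);

# Start all requests, then wait until every future is ready
# (done, failed or cancelled).
my @futures = map { $_->fetch_page_meta_ft } @pages;
Future->wait_all(@futures)->get;

for my $i (0 .. $#pages) {
    my $f = $futures[$i];
    if ($f->is_done) {
        my ($meta) = $f->get;    # the "page_meta" hash ref
        say $pages[$i]->url . ' meta keys> ' . join(', ', sort keys %$meta);
    }
    elsif ($f->is_failed) {
        my ($failure) = $f->failure;
        say $pages[$i]->url . " failed> $failure";
    }
}
```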

SEE ALSO
   <https://ogp.me/>

AUTHOR
   Jozef Kutej, "<jkutej at cpan.org>"

LICENSE AND COPYRIGHT
   Copyright 2021 [email protected]

   This program is free software; you can redistribute it and/or modify it
   under the terms of either: the GNU General Public License as published
   by the Free Software Foundation; or the Artistic License.

   See http://dev.perl.org/licenses/ for more information.