perl - I'm not getting the HTML tags while parsing


The fragment of HTML I want to parse is this:

<ul class="authors">
    <li class="author" itemprop="author" itemscope="itemscope" itemtype="http://schema.org/person">
        <a href="/search?facet-creator=%22charles+l.+fefferman%22" itemprop="name">charles l. fefferman</a>,
    </li>
    <li class="author" itemprop="author" itemscope="itemscope" itemtype="http://schema.org/person">
        <a href="/search?facet-creator=%22jos%c3%a9+l.+rodrigo%22" itemprop="name">josé l. rodrigo</a>
    </li>

I want to extract the whole <a> elements, but when I parse the content with WWW::Mechanize::TreeBuilder I only get the names of the authors. So:

Content I'm expecting:

<a href="/search?facet-creator=%22charles+l.+fefferman%22" itemprop="name">charles l. fefferman</a>,  <a href="/search?facet-creator=%22jos%c3%a9+l.+rodrigo%22" itemprop="name">josé l. rodrigo</a> 

Content I'm receiving:

charles l. fefferman, josé l. rodrigo 

Here is the code responsible for parsing this:

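use WWW::Mechanize;
use WWW::Mechanize::TreeBuilder;

my $mech = WWW::Mechanize->new();
WWW::Mechanize::TreeBuilder->meta->apply($mech);   # adds the HTML::TreeBuilder methods to $mech
$mech->get($addressdio);                           # $addressdio holds the page URL (set elsewhere)

# every element with class="author", i.e. the <li> items
my @authors = $mech->look_down('class', 'author');

print "authors: <br />";
foreach ( @authors ) {
    # as_text() returns only the text content, so the <a> tags are lost here
    print $_->as_text(), "<br />";
}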

I thought the cause might be as_text(), which returns only the text, while the CGI output needs the HTML.

I handled it, but in a totally different way - using HTML::TagParser:

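use HTML::TagParser;

my $html = HTML::TagParser->new("overwrite.xml");
my @li = $html->getElementsByAttribute('class', 'author');

foreach my $li (@li) {
    my $anchor = $li->firstChild();             # the <a> inside each <li>
    my $link   = $anchor->getAttribute('href');
    print $li->innerText;                       # author name
    print $link;                                # link to the author search
}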
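For comparison, here is a minimal sketch of how the <a> elements could be kept as markup within the HTML::TreeBuilder family, by calling as_HTML() on the anchor instead of as_text() on the <li>. It is not the code from the question: it parses a hard-coded, shortened copy of the fragment with plain HTML::TreeBuilder rather than fetching a page, and the names $fragment, $tree and @links are made up for the example. With WWW::Mechanize::TreeBuilder applied, the same look_down() calls should be available on $mech, but that is an assumption I have not tested here.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Shortened, ASCII-only stand-in for the fragment in the question (assumption).
my $fragment = <<'HTML';
<ul class="authors">
  <li class="author"><a href="/search?facet-creator=%22charles+l.+fefferman%22" itemprop="name">charles l. fefferman</a>,</li>
  <li class="author"><a href="/search?facet-creator=%22jose+l.+rodrigo%22" itemprop="name">jose l. rodrigo</a></li>
</ul>
HTML

my $tree = HTML::TreeBuilder->new_from_content($fragment);

my @links;
for my $li ( $tree->look_down( class => 'author' ) ) {
    # In scalar context look_down() returns the first matching descendant.
    my $anchor = $li->look_down( _tag => 'a' ) or next;

    # as_HTML() serializes the element with its tags; '<>&' limits
    # entity-encoding to those three characters.
    push @links, $anchor->as_HTML('<>&');
}

print "$_<br />\n" for @links;

$tree->delete;    # free the parse tree
```

The difference from the original loop is only which element is selected and which serializer is called: as_text() flattens an element to its text, while as_HTML() keeps the start tag, attributes and end tag.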
