Perl & LWP
Perl soared to popularity as a language for creating and managing web content, but with LWP (Library for WWW in Perl), Perl is equally adept at consuming information on the Web. LWP is a suite of modules for fetching and processing web pages. The Web is a vast data source that contains everything...
Main Author: | |
---|---|
Format: | eBook |
Language: | English |
Published: | Sebastopol, California : O'Reilly, 2002. |
Edition: | First edition |
Subjects: | |
See on Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009627195306719 |
Table of Contents:
- Table of Contents; Foreword; Preface; Audience for This Book; Structure of This Book; Order of Chapters; Important Standards Documents; Conventions Used in This Book; Comments & Questions; Acknowledgments; Introduction to Web Automation; The Web as Data Source; Screen Scraping; Brittleness; Web Services; History of LWP; Installing LWP; Installing LWP from the CPAN Shell; Configuring; Obtaining help; Installing LWP; Installing LWP Manually; Download distributions; Unpack and configure; Make, test, and install; Words of Caution; Network and Server Load; Copyright; Acceptable Use; LWP in Action
- The Object-Oriented Interface; Forms; Parsing HTML; Authentication; Web Basics; URLs; An HTTP Transaction; Request; Response; LWP::Simple; Basic Document Fetch; Fetch and Store; Fetch and Print; Previewing with HEAD; Fetching Documents Without LWP::Simple; Example: AltaVista; HTTP POST; Example: Babelfish; The LWP Class Model; The Basic Classes; Programming with LWP Classes; Inside the do_GET and do_POST Functions; User Agents; Connection Parameters; Request Parameters; Protocols; Redirection; Authentication; Proxies; Request Methods; Saving response content to a file
- Sending response content to a callback; Mirroring a URL to a file; Advanced Methods; HTTP::Response Objects; Status Line; Content; Headers; Expiration Times; Base for Relative URLs; Debugging; LWP Classes: Behind the Scenes; URLs; Parsing URLs; Constructors; Output; Comparison; Components of a URL; Queries; Relative URLs; Converting Absolute URLs to Relative; Converting Relative URLs to Absolute; Forms; Elements of an HTML Form; LWP and GET Requests; GETting Fixed URLs; GETting a query_form() URL; Automating Form Analysis; Idiosyncrasies of HTML Forms; Hidden Elements; Text Elements
- Password Elements; Checkboxes; Radio Buttons; Submit Buttons; Image Buttons; Reset Buttons; File Selection Elements; Textarea Elements; Select Elements and Option Elements; POST Example: License Plates; The Form; Use formpairs.pl; Translating This into LWP; POST Example: ABEBooks.com; The Form; Translating This into LWP; Adding Features; Generalizing the Program; File Uploads; Limits on Forms; Simple HTML Processing with Regular Expressions; Automating Data Extraction; Regular Expression Techniques; Anchor Your Match; Whitespace; Embedded Newlines; Minimal and Greedy Matches; Capture
- Repeated Matches; Develop from Components; Use Multiple Steps; Troubleshooting; When Regular Expressions Aren't Enough; Example: Extracting Links from a Bookmark File; Example: Extracting Links from Arbitrary HTML; Example: Extracting Temperatures from Weather Underground; HTML Processing with Tokens; HTML as Tokens; Basic HTML::TokeParser Use; Start-Tag Tokens; End-Tag Tokens; Text Tokens; Comment Tokens; Markup Declaration Tokens; Processing Instruction Tokens; Individual Tokens; Checking Image Tags; HTML Filters; Token Sequences; Example: BBC Headlines; Translating the Problem into Code
- Bundling into a Program