Introduction to Cheerio HTTP Client
cheerio-httpcli is a Node.js library that combines an HTTP client with Cheerio, so a page can be fetched and then queried with jQuery-style selectors in a single call. This makes it a convenient tool for web scraping and DOM manipulation. This article introduces its functionality through several API examples and a small application example to help you get started.
Basic Usage
First, install cheerio-httpcli via npm:
npm install cheerio-httpcli
Example 1: Fetching and Parsing a Web Page
Use fetch to retrieve a web page and print its title:
const client = require('cheerio-httpcli');

// The callback receives an error, the Cheerio object, the response, and the raw body
client.fetch('http://example.com', (err, $, res, body) => {
  if (err) {
    console.log(err);
    return;
  }
  console.log($('title').text());
});
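fetch can also be called without a callback, in which case it returns a Promise that resolves with an object holding $, response, and body. The following is a minimal sketch of that style, not a definitive implementation. The helper name fetchTitle is an assumption for illustration, and the client is passed in as a parameter so that any object with a Promise-returning fetch(url) can be used.

```javascript
// Hypothetical helper (not part of cheerio-httpcli): fetch a page and
// return its trimmed <title> text using the Promise form of fetch.
async function fetchTitle(client, url) {
  // Without a callback, fetch is expected to resolve with { $, response, body }
  const result = await client.fetch(url);
  return result.$('title').text().trim();
}

// Usage (assumes cheerio-httpcli is installed):
// const client = require('cheerio-httpcli');
// fetchTitle(client, 'http://example.com')
//   .then((title) => console.log(title))
//   .catch((err) => console.log(err));
```

With async/await, request errors surface as rejected Promises, so a single catch can replace the err check at the top of each callback.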
Example 2: Navigating and Extracting Data
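Select several elements at once and iterate over them with each to extract their text, here every h2 heading on the page:
const client = require('cheerio-httpcli');

client.fetch('http://example.com', (err, $, res, body) => {
  if (err) {
    console.log(err);
    return;
  }
  // Print the text of every <h2> element
  $('h2').each((index, elem) => {
    console.log($(elem).text());
  });
});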
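Logging inside each is fine for a quick look, but a scraper usually needs the values as data. The sketch below collects the text of matched elements into an array; collectText is a hypothetical helper name, and it relies only on the each and text methods already used above.

```javascript
// Hypothetical helper: gather the trimmed text of every element matching
// `selector` into a plain array, skipping elements with no text.
function collectText($, selector) {
  const texts = [];
  $(selector).each((index, elem) => {
    const text = $(elem).text().trim();
    if (text) {
      texts.push(text);
    }
  });
  return texts;
}

// Usage inside a fetch callback:
// const headings = collectText($, 'h2');
```

Returning an array keeps the extraction step separate from whatever is done with the result, such as printing, filtering, or writing it to a file.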
Example 3: Handling Forms
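Fetch the page that contains the form, then submit form data and retrieve the resulting page:
const client = require('cheerio-httpcli');

client.fetch('http://example.com/login', (err, $, res, body) => {
  // Stop here if the login page itself could not be fetched
  if (err) {
    console.log(err);
    return;
  }
  // Fill in the form fields and submit; the callback receives the resulting page
  $('#loginForm').submit({ username: 'testuser', password: 'password' }, (err, $, res, body) => {
    if (err) {
      console.log(err);
      return;
    }
    console.log('Logged in!');
  });
});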
Example 4: Downloading Images
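Download and save images. In cheerio-httpcli, client.download is an event emitter rather than a function: calling download() on the selected img elements queues them, and each finished download is delivered to the ready handler as a stream:
const fs = require('fs');
const client = require('cheerio-httpcli');

// Make sure the target directory exists
fs.mkdirSync('downloads', { recursive: true });

let count = 0;
client.download
  .on('ready', (stream) => {
    // Save each downloaded image under a numbered file name
    stream.pipe(fs.createWriteStream(`downloads/${count++}.jpg`));
  })
  .on('error', (err) => {
    console.log(err);
  });

client.fetch('http://example.com', (err, $, res, body) => {
  if (err) {
    console.log(err);
    return;
  }
  // Queue every image on the page for download
  $('img').download();
});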
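The example above names each file by a running index and assumes a .jpg extension. A minimal sketch of deriving the local name from the image URL instead, using only Node's built-in path module and the global URL class; the helper name localFileName and the fallback extension are assumptions for illustration, not part of cheerio-httpcli:

```javascript
const path = require('path');

// Hypothetical helper: build a local file path for an image, resolving a
// relative src against the page URL and keeping the original extension.
function localFileName(src, pageUrl, index, dir = 'downloads') {
  // new URL(src, base) handles both relative and absolute src values
  const absolute = new URL(src, pageUrl);
  // pathname excludes any query string; fall back to .jpg when no extension is present
  const ext = path.extname(absolute.pathname) || '.jpg';
  return path.join(dir, `${index}${ext}`);
}

console.log(localFileName('/img/logo.png', 'http://example.com/', 0));
```

Keeping the extension from the URL avoids saving PNG or GIF files under a .jpg name.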
Example 5: App Integration
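Create a simple scraper app that combines the previous steps: print the page title, list the headings, and download the images. As in cheerio-httpcli's download manager, the handlers on client.download are registered first and the images are queued with download():
const client = require('cheerio-httpcli');
const fs = require('fs');

// Make sure the target directory exists
fs.mkdirSync('downloads', { recursive: true });

// Save each downloaded image under a numbered file name
let count = 0;
client.download
  .on('ready', (stream) => {
    stream.pipe(fs.createWriteStream(`downloads/${count++}.jpg`));
  })
  .on('error', (err) => {
    console.log(err);
  });

client.fetch('http://example.com', (err, $, res, body) => {
  if (err) {
    console.log(err);
    return;
  }

  // Print page title
  console.log($('title').text());

  // Extract headings
  $('h2').each((index, elem) => {
    console.log($(elem).text());
  });

  // Download images
  $('img').download();
});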
Using the examples above as a starting point, you can build more complex applications that scrape pages and manipulate DOM elements. The cheerio-httpcli library simplifies many common web-scraping tasks.