Introduction to Cheerio
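Cheerio is a fast, flexible, and lean implementation of core jQuery designed specifically for the server. It parses HTML and XML into an in-memory tree that you can query and modify with jQuery-style selectors and methods, which makes it a good fit for web scraping. Cheerio does not fetch pages or execute JavaScript, so you pair it with an HTTP client when scraping live sites.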
Setting Up Cheerio
First, install Cheerio using npm:
npm install cheerio
Loading and Parsing HTML
Load HTML and manipulate it as you would with jQuery:
const cheerio = require('cheerio');
const html = `<ul id="fruits"><li class="apple">Apple</li><li class="orange">Orange</li><li class="pear">Pear</li></ul>`;
const $ = cheerio.load(html);
console.log($('ul').attr('id')); // 'fruits'
console.log($('.apple').text()); // 'Apple'
Cheerio API Examples
Selecting Elements
Use familiar jQuery selectors:
console.log($('#fruits').find('li').length); // 3
console.log($('li[class=orange]').html()); // 'Orange'
Manipulating DOM
Modify HTML content with ease:
const pear = $('.pear');
console.log(pear.text()); // 'Pear'
pear.text('Grape');
console.log(pear.text()); // 'Grape'
Attributes and Properties
Access and set attributes and properties:
console.log($('ul').attr('id')); // 'fruits'
$('ul').attr('id', 'newID');
console.log($('ul').attr('id')); // 'newID'
Traversal
Move around the DOM tree with powerful traversal methods:
console.log($('.apple').next().text()); // 'Orange'
console.log($('.pear').prev().text()); // 'Orange'
Removing Elements
Remove elements from the DOM:
$('.apple').remove();
console.log($('ul').html()); // the 'apple' item is gone; the other two <li> elements remain
Sample Application
Here is a small scraper that combines the Cheerio APIs shown above. It uses axios to fetch the page, so install that first with npm install axios:
const axios = require('axios');
const cheerio = require('cheerio');
axios.get('https://example.com')
  .then(response => {
    const $ = cheerio.load(response.data);
    // Extract the title of the page
    const title = $('title').text();
    console.log('Page title:', title);
    // Get all links with their text
    $('a').each((index, element) => {
      const text = $(element).text();
      const href = $(element).attr('href');
      console.log(text, href);
    });
  })
  .catch(error => {
    console.error('Error fetching the page:', error);
  });
Cheerio brings jQuery's familiar selector, traversal, and manipulation API to Node.js, so extracting data from HTML or rewriting markup usually takes only a few lines of code.