Data-Forge: The Ultimate Data Analysis and Transformation Toolkit for JavaScript
Data-Forge is a data manipulation library for JavaScript, inspired by Pandas in Python. It runs in Node.js and in modern browsers, and lets you load, manipulate, and transform data to suit your needs. This article walks through a handful of commonly used API functions with sample code snippets to get you started, followed by an application example that shows how Data-Forge fits into a small real-world task.
Getting Started
To begin using Data-Forge, install it via npm. Reading files from disk in Node.js also requires the companion data-forge-fs package:
npm install data-forge data-forge-fs
Loading Data
Data-Forge allows you to load data from various sources, such as JSON, CSV, and arrays. Here’s how you can do it:
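const dataForge = require('data-forge');
require('data-forge-fs'); // Adds readFileSync and other file functions to dataForge
// Load data from an array
const series = new dataForge.Series([5, 6, 7, 8]);
// Load data from a CSV file, converting numeric text to numbers
const dataFrame = dataForge.readFileSync("data.csv").parseCSV({ dynamicTyping: true });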
Transforming Data
Data manipulation and transformation are easy with Data-Forge. Below are some common transformations:
// Add a new column
const newData = dataFrame.withSeries("NewCol", dataframe => dataframe.getSeries("OldCol").select(value => value * 2));
// Filter rows
const filteredData = dataFrame.where(row => row.Field === 'desiredValue');
// Group rows by category and total the values in each group
const grouped = dataFrame.groupBy(row => row.Category);
const aggregated = grouped.select(group => ({
    Category: group.first().Category,
    Total: group.deflate(row => row.Value).sum()
}));
Data Analysis
Data analysis functions in Data-Forge make it super easy to gain insights into your data. Here are some examples:
// Calculate the mean of a column
const mean = dataFrame.getSeries("NumericColumn").average();
// Find the max value in a column
const max = dataFrame.getSeries("NumericColumn").max();
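// Collect summary statistics for a column
const column = dataFrame.getSeries("NumericColumn");
const summary = { min: column.min(), max: column.max(), average: column.average() };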
Application Example
Here’s a simple application example that demonstrates various Data-Forge APIs:
const dataForge = require('data-forge');
require('data-forge-fs'); // Adds readFileSync to dataForge
const dataFrame = dataForge.readFileSync("sales.csv").parseCSV({ dynamicTyping: true });
// Filter rows where sales amount is greater than 1000
const highSales = dataFrame.where(row => row.Amount > 1000);
// Add a new column with scaled sales amount
const scaledSales = highSales.withSeries("ScaledAmount", highSales.getSeries("Amount").select(value => value * 1.1));
// Group by region and calculate total sales
const groupedSales = scaledSales.groupBy(row => row.Region);
const regionalSalesSummary = groupedSales.select(group => ({
    Region: group.first().Region,
    TotalSales: group.deflate(row => row.Amount).sum()
}));
console.log(regionalSalesSummary.toArray());
In this example, we:
- Load sales data from a CSV file.
- Filter rows where the sales amount is greater than 1000.
- Add a new column that scales the sales amount by 1.1.
- Group the data by region and calculate the total sales for each region.
With these simple steps, you can perform powerful data transformations and analyses using Data-Forge. It offers a highly flexible and expressive API that makes working with data in JavaScript both fun and productive.