Introduction to NDJSON
NDJSON (Newline Delimited JSON) is a convenient and widely used format for streaming JSON data. Each line in an NDJSON file holds one complete JSON value, typically an object, so large datasets can be processed line by line without loading the whole file into memory. This makes the format particularly useful when data needs to be parsed or streamed incrementally.
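As a minimal sketch of the line-by-line property, the snippet below parses a small in-memory NDJSON stream one record at a time. The sample records and the helper name iter_records are illustrative assumptions, not part of any NDJSON library; io.StringIO simply stands in for a file or network stream.

```python
import io
import json

def iter_records(stream):
    """Yield one parsed JSON value per non-empty line of a text stream."""
    for line in stream:
        line = line.strip()
        if line:  # tolerate blank lines, e.g. a trailing newline
            yield json.loads(line)

# Two records, one per line (hypothetical sample data)
sample = io.StringIO('{"name": "John", "age": 30}\n{"name": "Jane", "age": 25}\n')
records = list(iter_records(sample))
print(records)
```

Because the helper is a generator, only one line is held in memory at a time, which is the main reason to prefer NDJSON over a single large JSON array.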
NDJSON API Examples
1. Reading NDJSON Files
import json

def read_ndjson(file_path):
    # Yield one parsed JSON value per line, skipping blank lines
    with open(file_path, 'r', encoding='utf-8') as file:
        for line in file:
            if line.strip():
                yield json.loads(line)

# Example usage
for obj in read_ndjson('data.ndjson'):
    print(obj)
2. Writing NDJSON Files
import json

def write_ndjson(file_path, data):
    with open(file_path, 'w') as file:
        for item in data:
            file.write(json.dumps(item) + '\n')

# Example usage
data = [{'name': 'John', 'age': 30}, {'name': 'Jane', 'age': 25}]
write_ndjson('output.ndjson', data)
3. Converting JSON to NDJSON
import json

def json_to_ndjson(json_list):
    return '\n'.join(json.dumps(item) for item in json_list)

# Example usage
json_list = [{'name': 'John', 'age': 30}, {'name': 'Jane', 'age': 25}]
ndjson_str = json_to_ndjson(json_list)
print(ndjson_str)
4. Converting NDJSON to JSON
import json

def ndjson_to_json(ndjson_str):
    # Skip blank lines so a trailing newline does not raise a decode error
    return [json.loads(line) for line in ndjson_str.splitlines() if line.strip()]

# Example usage
ndjson_str = '{"name": "John", "age": 30}\n{"name": "Jane", "age": 25}'
json_list = ndjson_to_json(ndjson_str)
print(json_list)
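One way to sanity-check the two conversions is a round trip: convert a list to NDJSON and back, then compare with the original. The sketch below repeats both functions so it is self-contained; the blank-line filter and the sample records are assumptions made for illustration.

```python
import json

def json_to_ndjson(json_list):
    return '\n'.join(json.dumps(item) for item in json_list)

def ndjson_to_json(ndjson_str):
    # Ignore blank lines (assumption: they carry no data)
    return [json.loads(line) for line in ndjson_str.splitlines() if line.strip()]

original = [{'name': 'John', 'age': 30}, {'name': 'Jane', 'age': 25}]
ndjson_str = json_to_ndjson(original)
restored = ndjson_to_json(ndjson_str)
assert restored == original

# json.dumps escapes newlines inside strings, so a value containing a
# line break still occupies exactly one NDJSON line.
tricky = [{'note': 'line one\nline two'}]
tricky_str = json_to_ndjson(tricky)
assert len(tricky_str.splitlines()) == 1
assert ndjson_to_json(tricky_str) == tricky
```

The second check matters because the format relies on a newline appearing only between records, never inside one.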
App Example Using NDJSON APIs
Let’s create a simple Python application that reads user data from an NDJSON file, processes it to calculate the average age, and writes the results back to another NDJSON file.
1. Reading Users and Calculating Average Age
import json

def read_ndjson(file_path):
    # Yield one parsed JSON value per line, skipping blank lines
    with open(file_path, 'r', encoding='utf-8') as file:
        for line in file:
            if line.strip():
                yield json.loads(line)

def calculate_average_age(ndjson_file):
    total_age = 0
    count = 0
    for user in read_ndjson(ndjson_file):
        total_age += user['age']
        count += 1
    # Return 0 for an empty file to avoid dividing by zero
    return total_age / count if count > 0 else 0

# Example usage
average_age = calculate_average_age('users.ndjson')
print('Average age:', average_age)
2. Writing Results to NDJSON
import json

def write_ndjson(file_path, data):
    with open(file_path, 'w') as file:
        for item in data:
            file.write(json.dumps(item) + '\n')

def save_results(results_file, average_age):
    # Wrap the single result in a list so it is written as one NDJSON record
    data = [{'average_age': average_age}]
    write_ndjson(results_file, data)

# Example usage (average_age is the value computed in step 1)
save_results('results.ndjson', average_age)
In this example, we first read the NDJSON file containing user data to calculate the average age. We then write this result as a single NDJSON record to a new file named results.ndjson.
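The two steps can also be exercised end to end without preparing input files by hand. The following is a sketch under stated assumptions: it writes hypothetical sample users into a temporary directory, computes the average, saves it, and reads the result back. The sample data and the temporary paths are illustrative only.

```python
import json
import os
import tempfile

def read_ndjson(file_path):
    with open(file_path, 'r', encoding='utf-8') as file:
        for line in file:
            if line.strip():  # skip blank lines
                yield json.loads(line)

def write_ndjson(file_path, data):
    with open(file_path, 'w', encoding='utf-8') as file:
        for item in data:
            file.write(json.dumps(item) + '\n')

def calculate_average_age(ndjson_file):
    total_age = 0
    count = 0
    for user in read_ndjson(ndjson_file):
        total_age += user['age']
        count += 1
    return total_age / count if count > 0 else 0

with tempfile.TemporaryDirectory() as tmp_dir:
    users_path = os.path.join(tmp_dir, 'users.ndjson')
    results_path = os.path.join(tmp_dir, 'results.ndjson')

    # Hypothetical input data, written in the same format the app reads
    write_ndjson(users_path, [{'name': 'John', 'age': 30},
                              {'name': 'Jane', 'age': 25}])

    average_age = calculate_average_age(users_path)
    write_ndjson(results_path, [{'average_age': average_age}])

    # Read the result back to confirm what was written
    results = list(read_ndjson(results_path))

print(results)
```

Using a temporary directory keeps the sketch from leaving files behind; in a real application the paths would point at actual input and output locations.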