Introduction to CML
CML (Continuous Machine Learning) is an open-source command-line tool for automating Machine Learning (ML) workflows on top of Git and CI/CD systems such as GitHub Actions and GitLab CI/CD. It bridges the gap between DevOps and ML by letting training, evaluation, and reporting run as part of an ordinary CI/CD pipeline.
Setting Up CML
To get started with CML, install it globally with npm or yarn (Node.js is required). The package is published on npm as @dvcorg/cml:
npm install -g @dvcorg/cml # or: yarn global add @dvcorg/cml
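After installing, it can help to confirm that the CML commands are actually on the PATH before wiring them into a pipeline. The following is a minimal sketch, not part of CML itself: find_cli is a hypothetical helper name, and the check assumes the hyphenated cml-publish command used later in this article is the one your installed version provides.

```shell
#!/usr/bin/env sh
# Hypothetical helper: print the path of a command if it is on PATH,
# and return non-zero otherwise.
find_cli() {
  command -v "$1" 2>/dev/null
}

# cml-publish is one of the commands described in this article.
if cml_path=$(find_cli cml-publish); then
  echo "CML found: $cml_path"
else
  echo "CML not found on PATH; check that the global npm bin directory is on PATH" >&2
fi
```

If the command is reported as missing right after a successful install, the usual cause is that the global npm or yarn bin directory is not included in PATH.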
Common CML API Examples
cml-runner
The cml-runner command creates a cloud runner to execute a job.
cml-runner \
--repo https://github.com/your/repo \
--token $GITHUB_TOKEN \
--cloud aws \
--cloud-region us-east-1
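Launching a cloud runner needs credentials for both the Git host and the cloud provider, and a missing variable tends to surface as a confusing failure partway through. The sketch below wraps the command above in a pre-flight check. The function names require_env and launch_runner are hypothetical, and it assumes AWS credentials are supplied through the standard AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.

```shell
#!/usr/bin/env sh
# Hypothetical helper: fail early, naming the first variable that is
# unset or empty.
require_env() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required environment variable: $name" >&2
      return 1
    fi
  done
}

# Hypothetical wrapper around the cml-runner invocation shown above.
# The repository URL is the article's placeholder, not a real repository.
launch_runner() {
  require_env GITHUB_TOKEN AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY || return 1
  cml-runner \
    --repo https://github.com/your/repo \
    --token "$GITHUB_TOKEN" \
    --cloud aws \
    --cloud-region us-east-1
}
```

Calling launch_runner from a CI step then either starts the runner or stops immediately with the name of the missing variable.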
cml-send-comment
The cml-send-comment command posts a comment on a specific commit, pull request, or merge request.
cml-send-comment \
--token $GITHUB_TOKEN \
--repo https://github.com/your/repo \
--pr \
--publish evaluation_report.html
cml-publish
The cml-publish command uploads an asset (e.g., text, image) to remote storage and returns a publicly accessible URL. With the --md flag, the output is a Markdown link that can be pasted into a report, which makes it useful for sharing results within the CI workflow.
cml-publish metrics.png --md
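A common pattern is to append the Markdown link printed by cml-publish to a report file and then post that file with cml-send-comment. The sketch below illustrates this under stated assumptions: build_report is a hypothetical helper, the file names metrics.txt, metrics.png, and report.md are placeholders, and the CI environment is assumed to already hold the credentials CML needs.

```shell
#!/usr/bin/env sh
# Hypothetical helper: assemble a Markdown report in the file given as $1.
# metrics.txt and metrics.png are placeholder names for evaluation outputs.
build_report() {
  report=$1
  printf '## Model metrics\n\n' > "$report"

  # Include plain-text metrics if the evaluation step wrote any.
  if [ -f metrics.txt ]; then
    cat metrics.txt >> "$report"
    printf '\n' >> "$report"
  fi

  # cml-publish uploads the image and, with --md, prints a Markdown link,
  # which is appended to the report.
  if command -v cml-publish >/dev/null 2>&1 && [ -f metrics.png ]; then
    cml-publish metrics.png --md >> "$report"
  else
    echo "skipping plot upload: cml-publish or metrics.png not available" >&2
  fi
}

build_report report.md

# Post the report as a comment when CML is installed.
if command -v cml-send-comment >/dev/null 2>&1; then
  cml-send-comment report.md
fi
```

Keeping report assembly in one function makes it easy to run locally without CML installed and inspect report.md before the pipeline posts it.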
Practical Application Example
Let’s walk through a simple application that uses these CML commands. We’ll create a CI pipeline that automatically reports the results of ML experiments.
Setting Up the Pipeline
Create a .github/workflows/cml.yaml file:
name: ML Experiment
on: [push]
jobs:
  run:
    # Assumes a self-hosted runner with the `cml` label is available.
    runs-on: [self-hosted, cml]
    container: docker://dvcorg/cml:0-dvc2-base1
    steps:
      - name: Checkout
        uses: actions/checkout@v2
      - name: Setup CML runner
        run: |
          # github.repository is only "owner/name", so build the full URL.
          cml-runner \
            --cloud aws \
            --cloud-region us-west \
            --repo https://github.com/${{ github.repository }} \
            --token ${{ secrets.GITHUB_TOKEN }} \
            --idle-timeout=300
      - name: Train model
        run: |
          python train.py
      - name: Evaluate model
        run: |
          python evaluate.py
          cml-send-comment --repo https://github.com/${{ github.repository }} --token ${{ secrets.GITHUB_TOKEN }} --pr --publish evaluation_report.html
This GitHub workflow sets up a CML runner, trains a machine learning model, evaluates the model, and posts the evaluation results as a comment on the pull request.
Conclusion
CML simplifies the integration of CI/CD practices into ML workflows, so that training, evaluation, and reporting happen automatically on every change. Using CML commands like cml-runner, cml-send-comment, and cml-publish, you can build automated ML pipelines that make results visible to the whole team and improve collaboration and productivity.