
Model management with MLflow

Model registry

What is the model registry in MLflow?

When using MLflow in Azure Machine Learning, the model registry is a central store for managing models across their lifecycle. It supports:

  • Versioning: Each model registered under the same name gets a new version.
  • Metadata tracking: Tags and properties can store details like the Git commit ID, dataset version, and training config.
  • Lineage and reproducibility: Models are linked to the code, environment, and data that produced them—enabling full traceability.

The model registry is a key part of MLOps in Azure ML, tracking models much as Git tracks code. It ensures models can be reproduced, compared, and deployed reliably across environments.
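As a minimal sketch of how this looks in practice, the snippet below trains a scikit-learn model, logs metadata, and registers it through MLflow. It assumes the MLflow tracking URI already points at an Azure ML workspace; the model name, tag values, and parameters are illustrative placeholders.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Assumes the MLflow tracking URI is already set to the Azure ML workspace's
# tracking URI (e.g. via mlflow.set_tracking_uri(...)).
X, y = load_iris(return_X_y=True)

with mlflow.start_run():
    model = LogisticRegression(max_iter=200).fit(X, y)

    # Metadata the registry tracks alongside each model version.
    mlflow.log_param("max_iter", 200)        # training config
    mlflow.set_tag("git_commit", "abc1234")  # hypothetical commit ID
    mlflow.set_tag("dataset_version", "v3")  # hypothetical dataset tag

    # Passing registered_model_name registers the model; calling this again
    # under the same name creates version 2, 3, and so on.
    mlflow.sklearn.log_model(
        sk_model=model,
        artifact_path="model",
        registered_model_name="iris-classifier",  # hypothetical registry name
    )
```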

MLflow model registry in Azure ML

What should an Azure ML model registry store?

A robust Azure ML model registry should capture all information needed for model lineage, reproducibility, and deployment (a registration sketch follows this list):

  • Model Artifacts:
    Store the trained model files (for example, Pickle, ONNX, MLflow, TensorFlow formats) for easy deployment and reuse.

  • Metadata & Lineage:
    Track model name, version, registration time, creator, and tags. Include references to training code (e.g., Git commit), environment (Docker/Conda), and training data location or snapshot.

  • Parameters & Metrics:
    Store input parameters (hyperparameters) and evaluation metrics (accuracy, F1, etc.) for each model version to enable comparison and monitoring.

  • Environment Details:
    Reference or store the software environment (Dockerfile, Conda YAML) used for training and inference to ensure reproducibility.

  • Data References:
    Link to the original training data or a snapshot in Azure Blob Storage or other supported datastores.
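To make this concrete, here is a hedged sketch of registering a model with the Azure ML Python SDK v2 (azure.ai.ml) while attaching lineage references as tags. The workspace coordinates, job path, model name, and tag values are all placeholders.

```python
from azure.ai.ml import MLClient
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Model
from azure.identity import DefaultAzureCredential

# Placeholder workspace coordinates.
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace-name>",
)

model = Model(
    # Model artifacts produced by a training job (placeholder job name).
    path="azureml://jobs/<job-name>/outputs/artifacts/paths/model/",
    type=AssetTypes.MLFLOW_MODEL,
    name="credit-default-classifier",  # hypothetical model name
    description="Classifier trained on the July data snapshot.",
    tags={
        # Lineage references stored as metadata (all values are placeholders).
        "git_commit": "abc1234",
        "training_data": "azureml:credit-data:3",
        "conda_env": "environments/train-env.yml",
    },
)

registered = ml_client.models.create_or_update(model)
print(registered.name, registered.version)
```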

Model packaging

What is model packaging and serialization?

  • Serialization is the process of converting a trained ML model into a file format (like Pickle, ONNX, or TensorFlow SavedModel) so it can be saved, transferred, and loaded elsewhere (see the sketch after this list).
  • Model packaging in Azure ML goes further: it bundles the serialized model plus all its dependencies (such as Python packages, environment files, and configuration) into a single, portable unit. This ensures the model can be reliably deployed and run in different environments.
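As a minimal illustration of serialization (not Azure ML packaging itself), the sketch below pickles a scikit-learn model and reloads it; the model choice and file name are arbitrary examples.

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Serialize: write the in-memory model object to a file.
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# Deserialize: load the file back into an equivalent model object
# (in practice, on another machine or in a deployment environment).
with open("model.pkl", "rb") as f:
    restored = pickle.load(f)

print(restored.predict(X[:3]))
```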

Serialization formats

Here are some popular serialization formats:

Sr. No. | Format | File Extension | Framework | Quantization
1 | Pickle | .pkl | scikit-learn | No
2 | HDF5 | .h5 | Keras | Yes
3 | ONNX | .onnx | TensorFlow, PyTorch, scikit-learn, Caffe, Keras, MXNet, iOS Core ML | Yes
4 | PMML | .pmml | scikit-learn | No
5 | TorchScript | .pt | PyTorch | Yes
6 | Apple ML Model | .mlmodel | iOS Core ML | Yes
7 | MLeap | .zip | PySpark | No
8 | Protobuf | .pb | TensorFlow | Yes

Addressing Interoperability

Most of the formats above are framework-specific, so a model serialized by one framework cannot be loaded by another. ONNX addresses this by providing a common representation: models can be exported from frameworks like scikit-learn, PyTorch, and TensorFlow and run with a shared runtime. This supports framework-independent deployment, federated learning, and batch inference.
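For example, a scikit-learn model can be exported to ONNX and then served with ONNX Runtime, with no scikit-learn dependency at inference time. This is a sketch assuming the skl2onnx and onnxruntime packages are installed; the model and file names are illustrative.

```python
# Requires: pip install scikit-learn skl2onnx onnxruntime
import numpy as np
import onnxruntime as ort
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=200).fit(X, y)

# Convert the scikit-learn model to ONNX, declaring the input name and shape.
onnx_model = convert_sklearn(
    model, initial_types=[("input", FloatTensorType([None, 4]))]
)
with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

# Run the exported model with ONNX Runtime; scikit-learn is no longer needed
# at inference time.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
labels = session.run(None, {"input": X[:3].astype(np.float32)})[0]
print(labels)
```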