Example workflow

Export an ONNX model from CSV with MLdeck

This illustrative workflow follows a model trained on CSV data from browser-local training to ONNX export. It explains why ONNX matters, what an export package should include, why schema and feature order are critical, and how to use parity validation before relying on exported artifacts outside MLdeck.

Why export ONNX after AutoML training

AutoML exploration is useful, but teams often need a portable artifact they can inspect, test, and hand to another environment. ONNX is an open model exchange format that ONNX Runtime and other compatible runtimes can load and execute. Exporting to ONNX can help move a model beyond the training UI while preserving an inspectable model graph and related metadata.

In MLdeck, ONNX export is positioned as an artifact for validation and deployment testing. It is not a shortcut around validation. A model that ranks well on an exploratory leaderboard still needs schema checks, representative-row testing, and runtime comparison before external use.

Example CSV-to-ONNX workflow

A user starts with a CSV, profiles columns, chooses a target, reviews feature inclusion, and trains candidate models. The task might be churn classification, music genre classification, price regression, or another tabular workflow. During normal browser training flows, raw CSV training data is not uploaded to a cloud training service.

After selecting a candidate, the user exports ONNX artifacts. The export should be treated as a package to review. A good workflow checks both the model graph and the surrounding contract: what features are expected, what preprocessing occurred, how categories are represented, and how missing values are handled.

What the export package should include

An ONNX export package may include model.onnx, a schema or feature manifest, preprocessing metadata, training manifest information, integrity hash information, validation samples, and optional Docker or Python package files if selected. The exact contents depend on export path and model support.

These files are not decoration. They explain the contract between the training workflow and the inference workflow. If a downstream service sends columns in the wrong order, omits required features, or encodes categories differently, predictions may be wrong even if the ONNX file loads successfully.
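One way to start reviewing a package is to list which expected files are present and to confirm that the model file matches its recorded hash. The sketch below is illustrative only: apart from model.onnx, the file names are hypothetical, and it assumes the integrity hash is a SHA-256 hex digest. The actual names and hash algorithm should be read from the export package itself.

```python
import hashlib
from pathlib import Path

# Hypothetical file names (only model.onnx is named in this guide);
# check the real export package for the actual ones.
EXPECTED_FILES = (
    "model.onnx",
    "schema.json",
    "preprocessing.json",
    "training_manifest.json",
)


def inventory(package_dir, expected=EXPECTED_FILES):
    """Return (present, missing) lists of expected file names in the package."""
    root = Path(package_dir)
    present = [name for name in expected if (root / name).is_file()]
    missing = [name for name in expected if name not in present]
    return present, missing


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_matches(path, recorded_hex):
    """Compare a file's digest with a recorded value (assumed SHA-256 hex)."""
    return sha256_of(path) == recorded_hex.strip().lower()
```

A missing file is not necessarily an error, since package contents depend on the export path, but it is worth knowing before downstream tests assume the file exists.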

Schema, preprocessing, and feature order

Feature order matters because many model runtimes expect a fixed input shape and identify features by position, not by name. The numeric and categorical schema matters because raw business data must be transformed in a way that matches training. Preprocessing metadata can describe imputation, clipping, scaling, category handling, feature selection, and target configuration.
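The effect of feature order can be shown with a small sketch. It assumes the row has already been preprocessed into numeric values and that the training order is available as a list, for example from a feature manifest. The column names are hypothetical.

```python
def row_to_vector(row, feature_order):
    """Arrange a dict of already-preprocessed numeric values in training order.

    Keys in `row` that are not in `feature_order` are ignored.
    """
    missing = [name for name in feature_order if name not in row]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [float(row[name]) for name in feature_order]


# Hypothetical feature order, as it might appear in a feature manifest.
FEATURE_ORDER = ["tenure_months", "monthly_charges", "support_tickets"]

# The incoming row lists the same features in a different order.
row = {"support_tickets": 2, "tenure_months": 14, "monthly_charges": 59.9}

vector = row_to_vector(row, FEATURE_ORDER)  # [14.0, 59.9, 2.0]
naive = list(row.values())                  # [2, 14, 59.9], wrong positions
```

Both lists have the right length, so a runtime that only checks input shape would accept either one. Only the first matches the positions the model was trained on.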

Before testing an exported model, review the schema. Confirm that the target column is not included as an input feature, excluded columns remain excluded, and identifier-like fields are not accidentally passed to inference. For CSV models, small schema mistakes can create large behavior changes.
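These review checks can be partly automated. The following is a minimal sketch, assuming the input feature list, target name, and excluded columns have been read from the export's schema files. The identifier pattern is a rough naming heuristic chosen for illustration, not a rule MLdeck applies.

```python
import re

# Hypothetical heuristic: names that are exactly "id"/"uuid"/"guid"
# or that end in "_id", "_uuid", "_guid".
ID_PATTERN = re.compile(r"(^|_)(id|uuid|guid)$", re.IGNORECASE)


def review_schema(input_features, target, excluded=()):
    """Return a list of human-readable problems found in the input feature list."""
    problems = []
    if target in input_features:
        problems.append(f"target column '{target}' is listed as an input")
    for name in excluded:
        if name in input_features:
            problems.append(f"excluded column '{name}' is listed as an input")
    for name in input_features:
        if ID_PATTERN.search(name):
            problems.append(f"'{name}' looks like an identifier")
    return problems


problems = review_schema(
    ["tenure_months", "customer_id", "churned"],
    target="churned",
    excluded=["signup_notes"],
)
# problems == [
#     "target column 'churned' is listed as an input",
#     "'customer_id' looks like an identifier",
# ]
```

A name-based check cannot catch every identifier-like field, so it supplements a manual read of the schema and does not replace it.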

Why parity validation matters

Parity validation compares MLdeck prediction behavior with exported artifact behavior. The goal is to check whether representative rows produce the same outputs, within a stated tolerance, when run through the export path. Do not assume identical behavior without validation. Runtime versions, numeric precision, unsupported operators, missing preprocessing, and category handling can all affect results.

A good parity set includes ordinary rows, missing-value rows, rare categories, boundary numeric values, and rows near important decision thresholds. For regression, include low, typical, and high target ranges. For classification, include examples from minority classes when possible.
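A parity check over such a set can be sketched as a row-by-row comparison with an explicit tolerance. The example assumes the MLdeck-side predictions were recorded for the same rows, in the same order, as the predictions from the exported artifact. The tolerance values are placeholders to be chosen per use case.

```python
import math


def parity_report(reference, exported, rel_tol=1e-5, abs_tol=1e-6):
    """Compare two equal-length lists of numeric predictions row by row.

    Returns a list of (row_index, reference_value, exported_value) mismatches.
    NaN never compares close, so NaN outputs are always reported.
    """
    if len(reference) != len(exported):
        raise ValueError("prediction lists differ in length")
    mismatches = []
    for index, (ref, exp) in enumerate(zip(reference, exported)):
        if not math.isclose(ref, exp, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches.append((index, ref, exp))
    return mismatches


# Illustrative values only, not real model output.
reference = [0.12, 0.87, 0.5000001]
exported = [0.12, 0.87, 0.52]

mismatches = parity_report(reference, exported)  # [(2, 0.5000001, 0.52)]
```

For classification, it can also be worth comparing predicted labels exactly, because a small probability difference near a decision threshold can flip the class.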

Testing ONNX artifacts outside MLdeck

ONNX artifacts can be tested with supported ONNX Runtime environments. A data scientist may start in Python. A web developer may test browser inference. An application team may use a Docker package to test API-style serving. The right test environment depends on the planned use case.
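For the Python route, a first test might look like the sketch below, which uses the onnxruntime package. It assumes the exported model takes a single 2-D numeric tensor input. That is an assumption: some exports declare one input per column, so inspect the declared inputs first and compare them with the schema.

```python
def open_session(model_path):
    """Create a CPU inference session (requires `pip install onnxruntime`)."""
    import onnxruntime as ort  # imported lazily; only needed for this helper

    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])


def describe_inputs(session):
    """List (name, type, shape) for each declared model input."""
    return [(arg.name, arg.type, arg.shape) for arg in session.get_inputs()]


def run_batch(session, batch):
    """Run a batch through a model ASSUMED to take one 2-D tensor input.

    With ONNX Runtime, `batch` should be a NumPy array whose dtype matches
    the declared input type (for example float32 for "tensor(float)").
    Returns the list of model outputs.
    """
    inputs = session.get_inputs()
    if len(inputs) != 1:
        raise ValueError(f"expected 1 input, model declares {len(inputs)}")
    shape = inputs[0].shape
    width = shape[-1] if shape else None
    # Dynamic dimensions are reported as None or a string, so only
    # check the feature count when it is a fixed integer.
    if isinstance(width, int) and any(len(row) != width for row in batch):
        raise ValueError(f"each row must have {width} features")
    return session.run(None, {inputs[0].name: batch})
```

A typical call would be `session = open_session("model.onnx")`, then `describe_inputs(session)`, then `run_batch(session, batch)` with rows arranged in the training feature order.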

External testing should include schema validation, representative predictions, error handling, missing feature handling, and monitoring considerations. If the exported artifact is based on sensitive data, handle the model, schema, and reports as derived artifacts that may need protection.

Docker and Python package options

MLdeck may produce Docker or Python package options alongside ONNX. A Docker package can help test a service-like prediction workflow. A Python package can help technical users inspect behavior, write custom tests, or integrate with existing validation scripts. These packages are useful for deployment testing, not automatic approval for important systems.

When testing these packages, use the same representative rows used for ONNX parity checks. Confirm that categorical inputs, missing fields, numeric ranges, and error responses behave as expected. If the package accepts raw JSON, check that the schema documentation is clear enough for an API consumer to build valid requests.
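If a package accepts raw JSON, a small request validator helps show whether the schema documentation is clear enough to build valid requests. The sketch below is illustrative: the schema layout (feature name mapped to "number" or "category") and the field names are hypothetical, and it assumes a JSON null means a missing value that preprocessing will handle.

```python
import json

# Hypothetical schema: feature name -> expected kind.
SCHEMA = {
    "tenure_months": "number",
    "monthly_charges": "number",
    "plan": "category",
}


def validate_payload(raw_json, schema=SCHEMA):
    """Return a list of problems with a raw JSON request body; empty means valid."""
    try:
        payload = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]

    problems = [f"missing field '{k}'" for k in schema if k not in payload]
    problems += [f"unexpected field '{k}'" for k in payload if k not in schema]

    for name, kind in schema.items():
        value = payload.get(name)
        if value is None:
            # Absent fields were reported above; null is assumed to mean
            # "missing value" and is left to preprocessing.
            continue
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if kind == "number" and not is_number:
            problems.append(f"'{name}' should be a number")
        if kind == "category" and not isinstance(value, str):
            problems.append(f"'{name}' should be a string category")
    return problems


issues = validate_payload(
    '{"tenure_months": 14, "monthly_charges": "59.9", "region": "EU"}'
)
# issues == ["missing field 'plan'", "unexpected field 'region'",
#            "'monthly_charges' should be a number"]
```

Running the same representative rows through a validator like this and through the package itself shows whether the package's error responses agree with what the schema documentation promises.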

Limits before deployment

MLdeck is an MVP in early beta. Exported artifacts should be tested before use outside MLdeck, and strict validation should come before relying on results for important decisions. Review the training data, warnings, validation scope, schema, target definition, feature timing, and runtime behavior. For high-impact workflows, add governance review and monitoring plans.

Also review operational questions that sit outside the model file. Who owns updates? What happens when a category is unseen? How will failed predictions be logged? Which dataset will be used for ongoing validation? The export workflow is the beginning of those checks, not the end.

ONNX export FAQ

Can MLdeck export ONNX models from CSV training?

Yes, for supported tabular workflows, with artifacts intended for validation and deployment testing.

What files are included in an ONNX export package?

The package may include model.onnx, schema or feature manifest files, preprocessing metadata, training manifest information, integrity hash information, validation samples, and optional Docker or Python artifacts. The exact contents depend on export path and model support.

Why does feature order matter?

Feature order defines the input contract. If downstream inputs are misordered, predictions may be incorrect.

What is parity validation?

It compares MLdeck-side behavior with exported artifact behavior across representative rows.

Can I test the exported ONNX model outside MLdeck?

Yes. Test with supported ONNX Runtime environments, Docker packages, or Python artifacts before external use.
