Privacy-first AutoML for CSV data

MLdeck is a browser-local, privacy-first AutoML product for teams, analysts, students, and builders who want to explore machine learning on CSV data without first sending raw training rows to a cloud AutoML workspace. During normal browser training flows, raw CSV data is not uploaded to MLdeck servers, and ONNX-oriented export packages can be generated from browser-local workflows for external validation and testing.

Why privacy-first AutoML matters

Many AutoML products are upload-first. The workflow begins by moving a spreadsheet, customer export, operations report, or research table into a cloud training service. That may be fine for public datasets, synthetic examples, or mature environments with a reviewed data-processing agreement. It is often uncomfortable for early exploration, classroom work, privacy-sensitive evaluation, regulated discovery, or internal analysis where the dataset may contain customer attributes, medical signals, financial information, or proprietary operational fields.

Privacy-first AutoML changes the starting point. The question becomes: can a user inspect a CSV, select a target, test preprocessing choices, compare baseline models, and produce useful evaluation artifacts before committing raw data to a hosted training platform? MLdeck answers that question with a browser-local approach for normal training flows. It is not a legal compliance certificate and it does not remove the need for governance. It reduces the amount of data movement needed for early modeling work.

How MLdeck keeps normal training flows browser-local

MLdeck uses browser-based execution for its core CSV training workflow. The CSV is parsed in the browser, profiling happens locally, and scikit-learn training runs through Pyodide, a WebAssembly build of the Python interpreter. Exploratory training therefore happens in the browser tab rather than in a remote notebook instance or a hosted AutoML job. During normal browser training flows, raw CSV data is not uploaded to MLdeck servers for model fitting.
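
To make the local parsing and profiling step concrete, here is a minimal sketch of the kind of Python that can run inside a browser-hosted interpreter. It is not MLdeck's actual code: the function name `profile_csv_text` and the three coarse type labels are assumptions chosen for illustration. It uses only the standard library and works on CSV text already held in memory, so nothing is read from or written to a network.

```python
import csv
import io


def profile_csv_text(csv_text: str) -> dict[str, str]:
    """Infer a coarse type for each column of an in-memory CSV string.

    Hypothetical helper for illustration; labels are 'numeric',
    'categorical', or 'empty'.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    values: dict[str, list[str]] = {name: [] for name in columns}
    for row in reader:
        for name in columns:
            cell = (row.get(name) or "").strip()
            if cell:  # skip empty cells so they do not affect the type guess
                values[name].append(cell)

    def is_number(text: str) -> bool:
        try:
            float(text)
            return True
        except ValueError:
            return False

    profile: dict[str, str] = {}
    for name, cells in values.items():
        if not cells:
            profile[name] = "empty"
        elif all(is_number(cell) for cell in cells):
            profile[name] = "numeric"
        else:
            profile[name] = "categorical"
    return profile


sample = "age,plan,churned\n34,basic,0\n51,pro,1\n,basic,0\n"
print(profile_csv_text(sample))
```

A real profiler would also look at cardinality, date formats, and value ranges. The point of the sketch is only that the whole step can operate on a string in local memory.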

Optional AI Copilot and Optional AI Advisory Notes are different from the local training path. They can send sanitized metadata to an external AI backend only after session-scoped metadata-sharing consent. That metadata may include column names, target name, model names, metrics, preprocessing steps, validation/export status, and the user's question. Raw CSV rows, row samples, uploaded file contents, model binaries, package binaries, prediction curves, and large artifacts are not sent by those optional AI features.
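
The consent gate and the metadata allowlist described above can be sketched as follows. This is an illustrative assumption about how such a filter might look, not MLdeck's implementation: the key names, `ConsentRequiredError`, and `build_ai_payload` are all hypothetical.

```python
# Hypothetical allowlist mirroring the metadata categories named in the text.
ALLOWED_METADATA_KEYS = frozenset({
    "column_names",
    "target_name",
    "model_names",
    "metrics",
    "preprocessing_steps",
    "validation_status",
    "export_status",
    "question",
})


class ConsentRequiredError(Exception):
    """Raised when metadata sharing is attempted without session consent."""


def build_ai_payload(session_consent: bool, candidate: dict) -> dict:
    """Return only allowlisted metadata, and only after consent was given."""
    if not session_consent:
        raise ConsentRequiredError(
            "metadata-sharing consent has not been granted for this session"
        )
    # Anything outside the allowlist (for example row samples) is dropped.
    return {
        key: value
        for key, value in candidate.items()
        if key in ALLOWED_METADATA_KEYS
    }


payload = build_ai_payload(
    session_consent=True,
    candidate={
        "column_names": ["age", "plan", "churned"],
        "target_name": "churned",
        "question": "Why is recall low?",
        "row_samples": [["34", "basic", "0"]],  # never forwarded
    },
)
print(sorted(payload))
```

An allowlist is used rather than a blocklist so that a newly added field is excluded by default until someone explicitly approves it.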

This architecture is especially helpful when you need to judge whether a dataset is worth modeling at all. You can load a file, inspect feature types, exclude leakage columns, choose a target, train candidate models, and review warnings before deciding whether any later external process is justified. The privacy-first value is practical: fewer raw-data transfers during browser-local CSV modeling.

Third-party runtime delivery (CDN)

To run scikit-learn training inside the browser, MLdeck's browser ML runtime may be downloaded from a third-party content delivery network when a training session starts. This is a code download, not a data upload: MLdeck does not send raw dataset rows, uploaded file bytes, or training data to that CDN.

These runtime assets come from three different places. Pyodide itself (the WebAssembly Python interpreter), its package index, the core science stack (numpy, pandas, and scikit-learn), and the micropip package installer load from jsDelivr, a third-party CDN. MLdeck's own ONNX export wheel and skl2onnx wheel are self-hosted and served from MLdeck's own origin, not from any CDN. The browser-side ONNX inference viewer (onnxruntime-web) is bundled directly into the app build and never touches a CDN.

Like any web download, the CDN receives normal network metadata for the runtime request itself, such as your IP address, request timing, and the requested runtime asset URLs. It does not receive your CSV contents. If jsDelivr is blocked or unavailable on your network, browser-local training cannot start because Pyodide itself cannot load, even though your data never leaves the browser.
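
For teams that maintain a network allowlist, one way to reason about this is to list which hosts other than the app's own origin must be reachable. The sketch below is illustrative only: `third_party_hosts` is a hypothetical helper, `app.example.com` stands in for the real app origin, and the asset paths (including the `vX.Y.Z` version segment) are placeholders rather than MLdeck's actual URLs.

```python
from urllib.parse import urlparse


def third_party_hosts(asset_urls: list[str], app_host: str) -> set[str]:
    """Return hosts other than the app's own origin that serve runtime assets."""
    hosts: set[str] = set()
    for url in asset_urls:
        host = urlparse(url).hostname
        # Relative URLs have no hostname: they resolve against the app origin.
        if host is not None and host != app_host:
            hosts.add(host)
    return hosts


# Placeholder URLs for illustration; real paths and versions will differ.
assets = [
    "https://cdn.jsdelivr.net/pyodide/vX.Y.Z/full/pyodide.js",  # third-party CDN
    "https://app.example.com/wheels/skl2onnx.whl",               # self-hosted
    "/assets/ort.wasm",                                          # bundled in the app
]
print(third_party_hosts(assets, app_host="app.example.com"))
```

With these placeholder inputs, the only external host is `cdn.jsdelivr.net`, which matches the description above: that host must be reachable for training to start.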

This runtime delivery path is separate from the optional AI features described above. Optional AI metadata sharing remains consent-gated and sends only sanitized metadata to the MLdeck backend, never to the runtime CDN.

What stays local and what backend services are used for

Raw CSV training rows are processed locally during normal browser training flows. Model fitting, prediction checks, profiling, and many export-preparation steps happen in the browser environment. MLdeck may still use backend services for app delivery, account features, support, usage controls, or other control-plane behavior. That distinction matters. Privacy-first does not mean the entire product has no server component; it means the raw CSV does not need to be sent to a cloud training service for the normal browser training workflow.

Users remain responsible for their own browser environment, device security, browser extensions, downloaded files, and any external service where they later test or verify exported artifacts. Treat exports, reports, and derived model files as sensitive if they were produced from sensitive data.

For a deeper boundary explanation, read AutoML without uploading raw CSV data. Before training, review Data Quality for Machine Learning so privacy-sensitive workflows also account for missing values, leakage risk, identifiers, and target imbalance.
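
As a rough illustration of the checks named above, the following sketch computes per-column missing rates, flags identifier-like columns, and reports the majority-class share of the target. It is a simplified assumption about what such checks can look like, not MLdeck's data-quality logic: `data_quality_report` and its output keys are hypothetical, and the identifier heuristic (every value present and unique) is deliberately crude and would also flag a fully unique numeric measurement.

```python
import csv
import io
from collections import Counter


def data_quality_report(csv_text: str, target: str) -> dict:
    """Summarise missing values, identifier-like columns, and target balance."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = list(reader.fieldnames or [])
    rows = list(reader)
    if target not in columns:
        raise ValueError(f"target column {target!r} not found")

    n = len(rows)
    missing_rate: dict[str, float] = {}
    identifier_like: list[str] = []
    for name in columns:
        cells = [(row.get(name) or "").strip() for row in rows]
        present = [cell for cell in cells if cell]
        missing_rate[name] = (n - len(present)) / n if n else 0.0
        # Crude heuristic: a non-target column whose values are all present
        # and all distinct behaves like a row identifier.
        if name != target and n > 1 and len(present) == n and len(set(present)) == n:
            identifier_like.append(name)

    target_counts = Counter((row.get(target) or "").strip() for row in rows)
    target_counts.pop("", None)  # ignore rows with a missing target
    total = sum(target_counts.values())
    majority_share = max(target_counts.values()) / total if total else None

    return {
        "rows": n,
        "missing_rate": missing_rate,
        "identifier_like": identifier_like,
        "majority_class_share": majority_share,
    }


sample = (
    "customer_id,age,plan,churned\n"
    "c1,34,basic,0\n"
    "c2,34,pro,0\n"
    "c3,,basic,0\n"
    "c4,51,basic,1\n"
)
print(data_quality_report(sample, target="churned"))
```

Identifier-like columns are worth excluding before training because a model can memorise them, and a high majority-class share is a reminder that plain accuracy may look good while the minority class is predicted poorly.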

Suitable use cases

CSV model review

Quickly test whether a CSV has signal, whether a target is sensible, and whether common models produce useful baseline evidence.

Education

Teach preprocessing, leakage, classification, regression, and evaluation without requiring every learner to install Python.

Privacy-sensitive review

Work with internal data in a browser-local normal training flow before deciding whether stricter infrastructure is needed.

Export testing

Generate ONNX-oriented export packages and supporting artifacts for external validation and testing after browser-local review.
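
One common form of external testing is a parity check: run the same inputs through the original model and the exported one, then compare the predictions. The sketch below assumes you already have both lists of numeric predictions. `parity_report` and the default tolerance are hypothetical choices for illustration, not part of MLdeck's export package.

```python
def parity_report(reference: list[float], exported: list[float],
                  abs_tol: float = 1e-5) -> dict:
    """Compare two prediction lists element by element within a tolerance."""
    if len(reference) != len(exported):
        raise ValueError("prediction lists must have the same length")
    diffs = [abs(r - e) for r, e in zip(reference, exported)]
    mismatches = sum(1 for diff in diffs if diff > abs_tol)
    return {
        "n": len(diffs),
        "max_abs_diff": max(diffs, default=0.0),
        "mismatches": mismatches,
        "passed": mismatches == 0,
    }


# Placeholder predictions; in practice these come from the two runtimes.
print(parity_report([0.10, 0.90, 0.45], [0.10, 0.90, 0.45]))
```

A small absolute tolerance is used because floating-point results can differ slightly between runtimes even when the model is equivalent. The right threshold depends on the model and on how the predictions are used downstream.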

Limits and validation responsibilities

MLdeck is a live browser-local AutoML product for practical CSV-based machine-learning workflows. Browser CPU and memory limits apply. MLdeck surfaces validation evidence, data-quality warnings, leakage-risk warnings, reports, and export metadata so users can review results in context. Sensitive-data users should also review internal policies, consent, retention, and access controls.

Privacy-first does not remove the need for validation, browser/device security, or responsible data handling.

Practical rule: use MLdeck to review CSV workflows and reduce raw-data movement during browser-local work, then pair browser-local evidence with governance review, external testing, and stricter validation evidence for high-impact settings.

Privacy-first AutoML FAQ

Does privacy-first AutoML mean my CSV is uploaded to MLdeck?

No. During normal browser training flows, raw CSV data is processed locally in the browser and is not uploaded to MLdeck servers.

Is MLdeck suitable for sensitive data?

It can be useful for privacy-sensitive exploration, but you should still consider browser extensions, local device security, downloaded artifacts, and your own compliance obligations.

Does MLdeck use backend services?

Yes. Backend services may support app, account, or control-plane features. That is separate from model training: during normal training flows, raw CSV data is not uploaded to those services.

Can I use exported models outside MLdeck?

Yes. MLdeck can generate ONNX-oriented export packages from browser-local workflows with schema, manifest, and parity-review metadata.

How does MLdeck support validation workflows?

MLdeck provides browser-local AutoML with baseline comparison, warnings, validation evidence, reports, and export metadata for downstream review.

Explore browser-local AutoML topics

Continue with related MLdeck guides for browser execution, CSV workflows, export testing, and training without installing Python.

Browser-based AutoML
Local AutoML for CSV files
AutoML without uploading raw CSV data
CSV Data Quality Checker
Data Quality for Machine Learning
Export ONNX models from the browser
Train ML models in your browser
Local AutoML vs cloud AutoML