Privacy-first AutoML

Privacy-first AutoML for CSV data

MLdeck is built for teams, analysts, students, and builders who want to explore machine learning on CSV data without first sending raw training rows to a cloud AutoML workspace. It runs tabular model training in the browser for normal training flows, then produces exportable ML artifacts for local evaluation, education, prototyping, and deployment testing.

Why privacy-first AutoML matters

Many AutoML products are upload-first. The workflow begins by moving a spreadsheet, customer export, operations report, or research table into a cloud training service. That may be fine for public datasets, synthetic examples, or mature enterprise environments with a reviewed data-processing agreement. It is often uncomfortable for early exploration, classroom work, startup prototyping, regulated discovery, or internal analysis where the dataset may contain customer attributes, medical signals, financial information, or proprietary operational fields.

Privacy-first AutoML changes the starting point. The question becomes: can a user inspect a CSV, select a target, test preprocessing choices, compare baseline models, and produce useful evaluation artifacts before committing raw data to a hosted training platform? MLdeck answers that question with a browser-local approach for normal training flows. It is not a legal compliance certificate and it does not remove the need for governance. It reduces the amount of data movement needed for early modeling work.

How MLdeck keeps normal training flows browser-local

MLdeck uses browser-based execution for its core CSV training workflow. The CSV is parsed in the browser, profiling happens locally, and scikit-learn training runs through WebAssembly-based Python tooling. This makes the browser tab, rather than a remote notebook instance or a hosted AutoML job, the place where exploratory training happens. During normal browser training flows, raw CSV training data is not uploaded to a server for model fitting.
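
As a rough illustration of what local parsing and profiling can look like, here is a minimal sketch that uses only the Python standard library, which is the kind of code a WebAssembly-based Python runtime can execute inside a tab. It is not MLdeck's implementation: the function names `infer_type` and `profile_csv`, the sample data, and the summary fields are all assumptions made for this example.

```python
import csv
import io


def infer_type(values):
    """Classify a column as 'numeric', 'categorical', or 'empty'."""
    present = [v for v in values if v.strip() != ""]
    if not present:
        return "empty"
    try:
        for v in present:
            float(v)  # raises ValueError on non-numeric text
        return "numeric"
    except ValueError:
        return "categorical"


def profile_csv(text):
    """Parse CSV text held in memory and summarize each column.

    Hypothetical helper: nothing here touches the network or the disk.
    """
    rows = list(csv.DictReader(io.StringIO(text)))
    columns = list(rows[0].keys()) if rows else []
    profile = {}
    for name in columns:
        # Short rows yield None for missing cells; treat them as empty strings
        values = [row[name] or "" for row in rows]
        profile[name] = {
            "type": infer_type(values),
            "missing": sum(1 for v in values if v.strip() == ""),
            "unique": len({v for v in values if v.strip() != ""}),
        }
    return profile


# Made-up sample data for illustration only
sample = "age,plan,churned\n34,basic,0\n,pro,1\n51,basic,0\n"
summary = profile_csv(sample)
for column, stats in summary.items():
    print(column, stats)
```

The whole CSV lives in a string in memory, so the profile is computed without any upload step. A real profiler would also need to handle dates, booleans, and locale-specific number formats.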

This architecture is especially useful when you are not yet sure whether a dataset is useful. You can load a file, inspect feature types, exclude leakage columns, choose a target, train candidate models, and review warnings before deciding whether any later external process is justified. The privacy-first value is practical: fewer raw-data transfers during the messy first stage of modeling.
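
One of the steps above, excluding leakage columns, can be partly supported by simple local checks. The sketch below shows one possible heuristic under stated assumptions: rows are already parsed into dictionaries, and a column is flagged if it is unique per row or if each of its values always appears with the same target value. The function name `flag_suspect_columns` and the sample rows are hypothetical, and this is not a description of the warnings MLdeck itself produces.

```python
def flag_suspect_columns(rows, target):
    """Return {column: reason} for columns worth reviewing before training.

    rows is a list of dicts (one per CSV row); target is the target column name.
    Hypothetical heuristic: a flag is a prompt for human review, not a verdict.
    """
    if not rows:
        return {}
    suspects = {}
    n = len(rows)
    for name in rows[0]:
        if name == target:
            continue
        values = [row[name] for row in rows]
        if len(set(values)) == n:
            suspects[name] = "unique per row (ID-like)"
            continue
        # Map each feature value to the set of target values seen with it
        seen = {}
        for row in rows:
            seen.setdefault(row[name], set()).add(row[target])
        if all(len(targets) == 1 for targets in seen.values()):
            suspects[name] = "determines the target exactly (possible leakage)"
    return suspects


# Made-up rows for illustration only
rows = [
    {"customer_id": "1", "age": "34", "status": "active", "churned": "0"},
    {"customer_id": "2", "age": "34", "status": "cancelled", "churned": "1"},
    {"customer_id": "3", "age": "51", "status": "active", "churned": "0"},
    {"customer_id": "4", "age": "51", "status": "cancelled", "churned": "1"},
]
suspects = flag_suspect_columns(rows, target="churned")
for column, reason in suspects.items():
    print(column, "->", reason)
```

The heuristic is deliberately crude. A continuous numeric feature with no repeated values would also be flagged as ID-like, and subtle leakage (for example, a field recorded after the outcome) cannot be detected from values alone.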

What stays local and what backend services are used for

Raw CSV training rows are processed locally during normal browser training flows. Model fitting, prediction checks, profiling, and many export-preparation steps happen in the browser environment. MLdeck may still use backend services for app delivery, account features, support, usage controls, or other control-plane behavior. That distinction matters. Privacy-first does not mean the entire product has no server component; it means the raw CSV does not need to be sent to a cloud training service for the normal browser training workflow.

Users remain responsible for their own browser environment, device security, browser extensions, downloaded files, and any external service where they later test or deploy exported artifacts. Treat exports, reports, and derived model files as sensitive if they were produced from sensitive data.

Suitable use cases

Exploratory modeling

Quickly test whether a CSV has signal, whether a target is sensible, and whether common models produce useful baseline evidence.

Education

Teach preprocessing, leakage, classification, regression, and evaluation without requiring every learner to install Python.

Privacy-sensitive prototyping

Work with internal data in the normal browser-local training flow before deciding whether stricter infrastructure is needed.

Export testing

Create ONNX, Docker, Python, or PDF artifacts for validation and deployment testing after local evaluation.

Limits and validation responsibilities

MLdeck is an MVP in early beta. It is useful for exploratory modeling and early evaluation, but it should not be treated as final evidence for important decisions without strict validation. Browser CPU and memory limits apply, large datasets may require streaming strategies, and exported artifacts should be checked with parity validation before deployment. Users working with sensitive data should also review internal policies, consent, retention, and access controls.
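
Parity validation here means confirming that an exported artifact gives the same predictions as the model evaluated in the browser on the same input rows. The sketch below shows one way to compare two lists of numeric predictions within a tolerance. The function name `parity_report`, the tolerance defaults, and the placeholder numbers are assumptions for this example; how the two prediction lists are obtained depends on the export format and is not shown.

```python
import math


def parity_report(reference, exported, rel_tol=1e-5, abs_tol=1e-8):
    """Compare two equal-length prediction lists element by element.

    Hypothetical helper: tolerances are illustrative and should be chosen
    for the model and numeric precision in question.
    """
    if len(reference) != len(exported):
        raise ValueError("prediction lists differ in length")
    # Indices where the two predictions disagree beyond the tolerance
    mismatches = [
        i
        for i, (a, b) in enumerate(zip(reference, exported))
        if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
    ]
    max_abs_diff = max(
        (abs(a - b) for a, b in zip(reference, exported)), default=0.0
    )
    return {
        "passed": not mismatches,
        "mismatches": mismatches,
        "max_abs_diff": max_abs_diff,
    }


# Placeholder numbers for illustration, not output from any real model
in_browser = [0.12, 0.80, 0.47]
from_export = [0.12, 0.80, 0.52]
report = parity_report(in_browser, from_export)
print(report["passed"], report["mismatches"])
```

Small floating-point differences between runtimes are normal, which is why the comparison uses a tolerance rather than strict equality. For classifiers, comparing predicted labels exactly and probabilities within a tolerance is a reasonable split.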

Practical rule: use MLdeck to learn quickly and reduce raw-data movement during early work. Use strict validation, governance review, and external testing before depending on model results in high-impact settings.

Privacy-first AutoML FAQ

Does privacy-first AutoML mean my CSV is uploaded to MLdeck?

No. During normal browser training flows, raw CSV training data is processed locally in the browser and is not uploaded to a cloud training service.

Is MLdeck suitable for sensitive data?

It can be useful for privacy-sensitive exploration, but you should still consider browser extensions, local device security, downloaded artifacts, and your own compliance obligations.

Does MLdeck use backend services?

Yes. Backend services may support app, account, or control-plane features. That is separate from model training: during normal training flows, the raw CSV is not uploaded to a cloud service for fitting.

Can I use exported models outside MLdeck?

Yes. Exports are intended for validation and deployment testing, especially ONNX and Docker artifacts. Validate behavior before relying on them.

Is MLdeck ready for enterprise validation workflows?

Not on its own. MLdeck is an MVP in early beta, so strict validation should be applied before relying on its results for important decisions.
