Answer-like names
Column names that closely match the target or final outcome may need review.
Target leakage happens when a column reveals the answer too directly or uses information that would not be available when a real prediction is made.
Leakage can make model results look unusually strong even though the model would perform poorly on real predictions. MLdeck helps surface review signals, but users still need to inspect the data context.
Imagine predicting whether a customer will churn, but the CSV includes a column named cancellation_reason. That column may only exist after the customer has already churned. A model can appear accurate because the answer is already hidden in the inputs.
Similar problems can happen with final_invoice_amount, resolved_status, refund_date, delivery_completed_at, or any field created after the event being predicted.
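The name patterns above can be approximated with a small script. This is a minimal Python sketch and does not describe how MLdeck itself flags columns. The hint words, the flag_columns helper, and the example column list are hypothetical. A flagged name is only a prompt to review the field, not evidence of leakage.

```python
import re

# Hypothetical hint words for fields that are often created after the outcome.
POST_EVENT_HINTS = {"reason", "final", "resolved", "refund", "completed"}


def name_tokens(name):
    """Split a column name into lowercase word tokens."""
    return {t for t in re.split(r"[^a-z0-9]+", name.lower()) if t}


def resembles(a, b, min_len=4):
    """True when one token is a prefix of the other (e.g. churn / churned)."""
    if len(a) < min_len or len(b) < min_len:
        return False
    return a.startswith(b) or b.startswith(a)


def flag_columns(columns, target):
    """Return {column: [reasons]} for names that deserve a manual review."""
    target_tokens = name_tokens(target)
    flagged = {}
    for col in columns:
        if col == target:
            continue
        col_tokens = name_tokens(col)
        reasons = []
        if any(resembles(c, t) for c in col_tokens for t in target_tokens):
            reasons.append("name resembles the target")
        hits = sorted(col_tokens & POST_EVENT_HINTS)
        if hits:
            reasons.append("post-event hint: " + ", ".join(hits))
        if reasons:
            flagged[col] = reasons
    return flagged


# Example column list (hypothetical), with "churned" as the target.
columns = [
    "customer_id", "plan", "churn_flag", "cancellation_reason",
    "final_invoice_amount", "resolved_status", "refund_date",
    "delivery_completed_at", "churned",
]
flagged = flag_columns(columns, target="churned")
for col, reasons in flagged.items():
    print(col, "->", "; ".join(reasons))
```

A name check like this misses leaky columns with neutral names and can flag harmless ones, so it complements a review of when each field is created rather than replacing it.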
Fields created after the prediction moment can make exploratory metrics misleading.
Very high scores can be real, but they should prompt a leakage and duplicate review.
Some IDs and statuses are harmless; others may encode an outcome. Context matters.
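For the duplicate part of that review, one simple starting point is to count rows that repeat exactly. The sketch below uses only the Python standard library; the duplicate_report helper and the sample rows are hypothetical, and it is not a description of MLdeck's Data Quality checks.

```python
from collections import Counter


def duplicate_report(rows, columns):
    """Summarize exact repeats of the given columns across a list of dict rows."""
    keys = [tuple(row[c] for c in columns) for row in rows]
    counts = Counter(keys)
    # Copies beyond the first occurrence of each repeated combination.
    extra = sum(n - 1 for n in counts.values() if n > 1)
    repeated = sum(1 for n in counts.values() if n > 1)
    share = extra / len(rows) if rows else 0.0
    return {
        "rows": len(rows),
        "repeated_keys": repeated,
        "extra_copies": extra,
        "share": share,
    }


# Hypothetical sample: the first row appears three times.
rows = [
    {"plan": "basic", "tenure": 3, "churned": 1},
    {"plan": "pro", "tenure": 12, "churned": 0},
    {"plan": "basic", "tenure": 3, "churned": 1},
    {"plan": "pro", "tenure": 5, "churned": 0},
    {"plan": "basic", "tenure": 3, "churned": 1},
]
report = duplicate_report(rows, ["plan", "tenure", "churned"])
print(report)
```

Repeated rows are not always a mistake, but when copies of the same record land in both the training and evaluation data, scores can look better than they would on new records.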
MLdeck AutoML trains and compares models in the browser. Data Quality helps users review a CSV before relying on model results. The two workflows support each other, but neither removes the need for human review.
If leakage risk appears, compare a version with the suspicious column excluded, document the decision, and use stronger validation before relying on the result.
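The with-and-without comparison can be sketched as follows. This is an illustrative Python example under stated assumptions: the data is made up, the holdout_accuracy helper is hypothetical, and the lookup-table baseline stands in for whatever model is actually being evaluated. It is not MLdeck's training procedure.

```python
from collections import Counter, defaultdict


def holdout_accuracy(train, test, features, target):
    """Accuracy of a lookup-table baseline on held-out rows.

    Predicts the majority target seen in training for each exact feature
    combination, falling back to the overall majority for unseen ones.
    """
    overall = Counter(r[target] for r in train).most_common(1)[0][0]
    votes = defaultdict(Counter)
    for r in train:
        votes[tuple(r[f] for f in features)][r[target]] += 1
    correct = 0
    for r in test:
        key = tuple(r[f] for f in features)
        pred = votes[key].most_common(1)[0][0] if key in votes else overall
        correct += pred == r[target]
    return correct / len(test)


# Hypothetical churn data: cancellation_reason is only filled in after churn.
train = [
    {"plan": "basic", "cancellation_reason": "price", "churned": 1},
    {"plan": "basic", "cancellation_reason": "", "churned": 0},
    {"plan": "pro", "cancellation_reason": "", "churned": 0},
    {"plan": "pro", "cancellation_reason": "support", "churned": 1},
    {"plan": "basic", "cancellation_reason": "", "churned": 0},
    {"plan": "pro", "cancellation_reason": "", "churned": 0},
]
test = [
    {"plan": "basic", "cancellation_reason": "price", "churned": 1},
    {"plan": "pro", "cancellation_reason": "", "churned": 0},
    {"plan": "basic", "cancellation_reason": "", "churned": 0},
    {"plan": "pro", "cancellation_reason": "support", "churned": 1},
]

with_column = holdout_accuracy(
    train, test, ["plan", "cancellation_reason"], "churned"
)
without_column = holdout_accuracy(train, test, ["plan"], "churned")
print("with suspicious column:   ", with_column)
print("without suspicious column:", without_column)
```

A large drop after excluding one column does not prove leakage on its own, since the column may simply be a strong legitimate predictor. It does indicate that the column deserves a closer look at when and how it is populated.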
Leakage review is especially important before sharing a report, exporting artifacts, or making decisions from a leaderboard. A warning is not proof of leakage; it is a reason to inspect the data source carefully.