Documentation

API reference, detectors, and implementation notes.

This page draws on the repository docs and covers what you need to use, extend, and troubleshoot SK-AutoD without digging through the source tree.

Public API: 4 (diagnose, quick_check, AutoDCallback, BaseDetector)
Built-in detectors: 10 (from overfitting to label noise floor)
Input format: 2 arrays (train_loss and val_loss sequences)
Output formats: 3 (text, JSON, HTML reports)

API reference

Entry points that matter.

The package is intentionally small. These are the APIs that the README, examples, and CLI all build on.

diagnose(train_loss, val_loss, *, detectors=None)
Run the full diagnostics pipeline and return a DiagnosisReport.
Parameter | Type | Description
train_loss | Sequence[float] | Training loss values per epoch, at least 5 points recommended.
val_loss | Sequence[float] | Validation loss values per epoch, same length as training loss.
detectors | Optional[list[BaseDetector]] | Optional custom detectors to run alongside the built-in set.
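
The input expectations listed for diagnose() can be checked before you call it. The following is a minimal sketch: validate_curves and MIN_POINTS are hypothetical names introduced here for illustration, not part of SK-AutoD, and the library's own validation may behave differently.

```python
from typing import Sequence

MIN_POINTS = 5  # the docs recommend at least 5 points; treated here as a soft limit


def validate_curves(train_loss: Sequence[float], val_loss: Sequence[float]) -> list[str]:
    """Return a list of problems; an empty list means the curves look usable.

    Hypothetical helper for illustration, not part of the SK-AutoD API.
    """
    problems = []
    if len(train_loss) != len(val_loss):
        problems.append(
            f"length mismatch: {len(train_loss)} train vs {len(val_loss)} val points"
        )
    if min(len(train_loss), len(val_loss)) < MIN_POINTS:
        problems.append(f"fewer than {MIN_POINTS} points; detection may be unreliable")
    return problems


train = [2.3, 1.9, 1.4, 0.9, 0.5, 0.3, 0.15]
val = [2.4, 2.0, 1.8, 1.9, 2.3, 2.8, 3.4]
print(validate_curves(train, val))      # no problems for equal-length, 7-point curves
print(validate_curves(train[:3], val))  # reports both a mismatch and a short curve
```

Returning a list of messages rather than raising keeps the check usable in notebooks, where you may want to see every problem at once.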

DiagnosisReport

Returned by diagnose(). Use it for text summaries, JSON serialization, or HTML reporting.

Finding

Represents a single issue with severity, confidence, message, recommendation, and epoch metadata.

Output formats

Choose the format that fits the workflow.

You can use the same report object in a notebook, in a CI job, or as a standalone HTML artifact.

Text
Formatted, human-readable summary from summary().
JSON
Programmatic access from to_dict() for logs, APIs, and pipelines.
HTML
Standalone report from to_html() for interactive review.
Recommended flow: call diagnose(), inspect report.findings, then use summary() for terminal workflows and to_html() when you want a report you can share.

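
To make the three formats concrete, here is a self-contained sketch of one report object feeding text, JSON, and HTML output. SketchFinding and SketchReport are stand-ins written for this page, not the library's classes. The field names are assumptions based on the Finding description, and only the method names summary(), to_dict(), and to_html() come from the docs.

```python
import html
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SketchFinding:
    # Field names are assumptions derived from the Finding description.
    severity: str
    confidence: float
    message: str
    recommendation: str
    epoch: int


@dataclass
class SketchReport:
    # Stand-in for DiagnosisReport; not the library's implementation.
    findings: list[SketchFinding] = field(default_factory=list)

    def summary(self) -> str:
        """Human-readable text, one line per finding."""
        if not self.findings:
            return "No issues found."
        return "\n".join(
            f"[{f.severity}] epoch {f.epoch}: {f.message} -> {f.recommendation}"
            for f in self.findings
        )

    def to_dict(self) -> dict:
        """Plain dict that json.dumps can serialize."""
        return {"findings": [asdict(f) for f in self.findings]}

    def to_html(self) -> str:
        """Tiny HTML fragment; messages are escaped before embedding."""
        items = "".join(f"<li>{html.escape(f.message)}</li>" for f in self.findings)
        return f"<ul>{items}</ul>"


report = SketchReport([
    SketchFinding(
        severity="critical",
        confidence=0.9,
        message="Validation loss rises while training loss falls.",
        recommendation="Add dropout or L2 regularization.",
        epoch=3,
    )
])
print(report.summary())              # terminal workflows
print(json.dumps(report.to_dict()))  # logs, APIs, pipelines
print(report.to_html())              # shareable artifact
```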
Detector catalogue

What the library flags.

These are the built-in patterns described in the repository docs and surfaced in the homepage table.

Detector | Severity | Signal | Typical fix
Classic overfitting | critical | Validation loss rises while training loss falls. | Dropout, L2 regularization, reduce capacity.
Exploding gradient | critical | Loss spikes sharply in a single epoch. | Gradient clipping, lower the learning rate.
Data leakage proxy | high | Validation loss is suspiciously lower than training loss. | Audit the train/validation split.
LR too high | high | Loss oscillates without a clear downtrend. | Reduce LR and add warmup.
Underfitting | high | Both losses plateau at high values. | Increase model capacity or train longer.
Dying ReLU proxy | high | Loss flatlines early at a high value. | Use LeakyReLU, He init, normalize inputs.
LR too low | medium | Loss decreases very slowly. | Increase LR or adjust the schedule.
Noisy training | medium | Jagged curve with frequent direction flips. | Increase batch size, add smoothing.
Label noise floor | medium | Loss cannot drop below a suspicious floor. | Inspect labels and clean the dataset.
Missed early stopping | warning | The best validation point was not saved. | Enable EarlyStopping with restore_best_weights.

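
As an illustration of how a signal like the classic overfitting one can be turned into a check, the sketch below looks for consecutive epochs where validation loss rises while training loss falls. This is a simple heuristic written for this page, not the library's detector. The function name overfitting_onset and the patience parameter are assumptions.

```python
from typing import Optional, Sequence


def overfitting_onset(
    train_loss: Sequence[float],
    val_loss: Sequence[float],
    patience: int = 2,
) -> Optional[int]:
    """Return the first epoch index of a run of `patience` consecutive steps
    in which val loss rose while train loss fell, or None if no such run exists.
    """
    run = 0
    for i in range(1, min(len(train_loss), len(val_loss))):
        diverging = val_loss[i] > val_loss[i - 1] and train_loss[i] < train_loss[i - 1]
        run = run + 1 if diverging else 0
        if run >= patience:
            return i - patience + 1  # first epoch of the diverging run
    return None


train = [2.3, 1.9, 1.4, 0.9, 0.5, 0.3, 0.15]
overfit_val = [2.4, 2.0, 1.8, 1.9, 2.3, 2.8, 3.4]
healthy_val = [2.4, 2.0, 1.7, 1.3, 1.0, 0.8, 0.7]

print(overfitting_onset(train, overfit_val))  # 3: val loss turns upward at index 3
print(overfitting_onset(train, healthy_val))  # None: both curves keep falling
```

Requiring more than one diverging step is a design choice that keeps a single noisy epoch from triggering a critical finding.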
Custom detectors

Extend without fighting the framework.

Custom detectors follow the same shape as the built-ins: inspect the preprocessed report and return findings when you match a pattern.

class MyDetector(BaseDetector):
    name = "my_custom_issue"

    def detect(self, report):
        # access report.train_loss, report.val_loss, and metadata
        ...
Design constraint: Detectors should be deterministic, stateless, and fast. The repo architecture explicitly keeps detector logic pure so results are stable and debuggable.

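
The sketch below shows what a deterministic, stateless detector might look like in practice. So that it runs on its own, it uses stand-ins: CurveReport and SketchBaseDetector are hypothetical substitutes for the library's report and BaseDetector, GapDetector and its max_gap threshold are invented for illustration, and returning findings as plain dicts is an assumption.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CurveReport:
    # Stand-in exposing only the attributes named in the docs snippet.
    train_loss: Sequence[float]
    val_loss: Sequence[float]


class SketchBaseDetector:
    # Stand-in for BaseDetector so the example is self-contained.
    name = "base"

    def detect(self, report):
        raise NotImplementedError


class GapDetector(SketchBaseDetector):
    name = "large_generalization_gap"
    max_gap = 1.0  # class-level constant; never mutated, so detect() stays pure

    def detect(self, report):
        # Reads only from the report and writes nothing to self.
        gap = report.val_loss[-1] - report.train_loss[-1]
        if gap > self.max_gap:
            return [{
                "detector": self.name,
                "message": f"final gap {gap:.2f} exceeds {self.max_gap}",
            }]
        return []


report = CurveReport(
    train_loss=[2.3, 1.9, 1.4, 0.9, 0.5, 0.3, 0.15],
    val_loss=[2.4, 2.0, 1.8, 1.9, 2.3, 2.8, 3.4],
)
detector = GapDetector()
first = detector.detect(report)
second = detector.detect(report)
print(first)
print(first == second)  # same input, same findings
```

Because detect() depends only on its argument, running it twice on the same report yields identical findings, which is what makes results stable and debuggable.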
CLI

Useful from terminal and CI.

The command-line interface mirrors the Python API and is suited to quick inspections, scripts, and automated checks.

sk-autod diagnose \
  --train-loss 2.3 1.9 1.4 0.9 0.5 0.3 0.15 \
  --val-loss 2.4 2.0 1.8 1.9 2.3 2.8 3.4 \
  --output json
Files
Use --train-file and --val-file when your curves live in CSVs.
Pipe-friendly
You can stream values into stdin from small shell scripts or automation jobs.

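
When the curves already live in Python lists, the command line shown in this section can be assembled programmatically. This is a sketch that uses only the flags from that example (--train-loss, --val-loss, --output json). build_diagnose_command is a hypothetical helper, and the CLI's exact argument handling is not verified here.

```python
import shlex
from typing import Sequence


def build_diagnose_command(
    train_loss: Sequence[float], val_loss: Sequence[float]
) -> list[str]:
    """Build the argument list for the `sk-autod diagnose` call shown in the docs."""
    return [
        "sk-autod", "diagnose",
        "--train-loss", *map(str, train_loss),
        "--val-loss", *map(str, val_loss),
        "--output", "json",
    ]


cmd = build_diagnose_command(
    [2.3, 1.9, 1.4, 0.9, 0.5, 0.3, 0.15],
    [2.4, 2.0, 1.8, 1.9, 2.3, 2.8, 3.4],
)
print(shlex.join(cmd))  # shell-safe string for logging or copy-paste
# To execute it (requires sk-autod on PATH):
# subprocess.run(cmd, capture_output=True, text=True, check=True)
```

Passing an argument list to subprocess.run instead of a shell string avoids quoting problems in CI scripts.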
Installation

What the package expects.

The project metadata targets Python 3.10+ and keeps dependencies light. Development extras are separated from runtime usage.

Runtime

Install from PyPI with pip install sk-autod. The package itself has no mandatory runtime dependencies declared in pyproject.toml.

Development

Use pip install -e ".[dev]" for tests, linting, and local package work.

Architecture

How data moves through the system.

The architecture document describes a simple pipeline: validate the curves, preprocess them, run detectors, deduplicate findings, then format output.

Input curves → Preprocessor → DiagnosisReport → Detectors → Deduplication → Severity sort → Formatter
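
The stages in this pipeline can be sketched in a few lines of plain Python. Everything here is an illustrative assumption rather than the library's code: run_pipeline, the dict-shaped findings, and the two toy detectors are invented for this example, and the DiagnosisReport stage is collapsed into plain lists. Only the stage order and the severity levels come from the docs.

```python
from typing import Callable, Sequence

# Severity levels from the detector catalogue, most severe first.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "warning": 3}

# A detector here is any function (train, val) -> list of finding dicts.
Detector = Callable[[Sequence[float], Sequence[float]], list]


def run_pipeline(train_loss, val_loss, detectors: Sequence[Detector]) -> str:
    # 1. Validate the curves.
    if len(train_loss) != len(val_loss):
        raise ValueError("train_loss and val_loss must have the same length")
    # 2. Preprocess (in this sketch: coerce to lists of floats).
    train = [float(x) for x in train_loss]
    val = [float(x) for x in val_loss]
    # 3. Run every detector and collect findings.
    findings = [f for detect in detectors for f in detect(train, val)]
    # 4. Deduplicate: keep one finding per (detector, message) pair.
    unique = list({(f["detector"], f["message"]): f for f in findings}.values())
    # 5. Sort by severity, most severe first (the sort is stable).
    unique.sort(key=lambda f: SEVERITY_ORDER[f["severity"]])
    # 6. Format as text.
    return "\n".join(f'{f["severity"]}: {f["message"]}' for f in unique)


def best_not_last(train, val):
    # Toy proxy for "missed early stopping".
    if val.index(min(val)) != len(val) - 1:
        return [{"detector": "early_stopping", "severity": "warning",
                 "message": "best validation epoch was not the last epoch"}]
    return []


def val_above_minimum(train, val):
    # Toy proxy for "classic overfitting".
    if val[-1] > min(val) and train[-1] < train[0]:
        return [{"detector": "overfitting", "severity": "critical",
                 "message": "validation loss ended above its minimum"}]
    return []


output = run_pipeline(
    [2.3, 1.9, 1.4, 0.9, 0.5, 0.3, 0.15],
    [2.4, 2.0, 1.8, 1.9, 2.3, 2.8, 3.4],
    [best_not_last, val_above_minimum, val_above_minimum],  # duplicate on purpose
)
print(output)
```

The duplicated detector yields the same finding twice, so the printed result shows both deduplication and severity ordering: the critical finding comes first even though the warning detector ran first.
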
FAQ

Common questions.

Minimum length
The docs and architecture notes point to at least 5 epochs of curve history for meaningful detection.
Supported workflows
Notebook checks, CLI diagnostics, CI output, and custom detector extensions are all first-class.
Dependencies
The package is intentionally lightweight and is designed to stay easy to install in training environments.