The Soft 404 detector uses machine learning to identify pages that return a 200 OK status code but actually represent error or non-existent content. This reduces false positives in scan results.

What is a soft 404?

A “soft 404” occurs when a web server returns a successful HTTP status code (typically 200) for a page that doesn’t actually exist or contains error content. Common examples include:
  • Custom “Page not found” pages that return 200 instead of 404
  • Error pages styled to match the site design
  • Placeholder pages with generic content
  • Missing pages that redirect to the homepage or a parent directory instead of returning 404
These soft 404 pages can pollute scan results with false positives, making it harder to identify real findings.

How it works

Our AI classifier analyzes HTTP responses to distinguish between:
  • Legitimate pages: Real content that should be reported as findings
  • Soft 404 pages: Error pages disguised as valid responses
The classifier uses multiple detection techniques:
  • Compares page content against known 404 response patterns for the target site
  • Analyzes page content for common error indicators and patterns
  • Uses trained models to classify ambiguous responses
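To make the first two techniques more concrete, here is a minimal Python sketch of what a baseline comparison and an error-indicator check could look like. It is not the product's implementation: the function names, the indicator phrases, and the 0.9 similarity threshold are all assumptions chosen for illustration, and the trained-model step is not reproduced here.

```python
import re
from difflib import SequenceMatcher

# Hypothetical indicator phrases; a real detector would rely on a far richer set.
ERROR_INDICATORS = (
    "page not found",
    "does not exist",
    "no longer available",
    "error 404",
)


def normalize(body: str) -> str:
    """Strip HTML tags, collapse whitespace, and lowercase the text."""
    text = re.sub(r"<[^>]+>", " ", body)
    return re.sub(r"\s+", " ", text).strip().lower()


def similarity_to_baseline(body: str, baseline_body: str) -> float:
    """Return a 0..1 similarity score between a response and a known error page."""
    return SequenceMatcher(None, normalize(body), normalize(baseline_body)).ratio()


def has_error_indicators(body: str) -> bool:
    """Check the visible text for any of the assumed error phrases."""
    text = normalize(body)
    return any(phrase in text for phrase in ERROR_INDICATORS)


def looks_like_soft_404(status: int, body: str, baseline_body: str,
                        threshold: float = 0.9) -> bool:
    """Flag a 200 response that resembles the site's error page or contains error text.

    baseline_body is assumed to be the response the site returned for a path
    that is known not to exist (for example, a random string).
    """
    if status != 200:
        return False  # only successful responses can be soft 404s
    if similarity_to_baseline(body, baseline_body) >= threshold:
        return True
    return has_error_indicators(body)
```

In this sketch the baseline would be collected once per target by requesting a path that cannot exist. Responses that neither rule can settle are the ambiguous cases the document assigns to the trained models.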

Usage in Website Scanner

The soft 404 detector is integrated into the Website Scanner.

Enabled tests

The classifier runs automatically when the following tests are enabled in the Initial Tests section:
| Test | Generated finding |
| --- | --- |
| Find admin consoles | Administration consoles found |
| Find sensitive files | Sensitive files found |
| Find interesting files | Interesting files found |
| Search for information disclosure | Server information disclosure |
| Software identification | Server software identified |
The soft 404 detector is enabled by default for these tests.

How it improves results

Without soft 404 detection, these tests might report hundreds of false positives: pages that appear to exist but are actually custom error pages. The ML classifier filters these out, so you only see legitimate discoveries.

Usage in URL Fuzzer

The soft 404 detector is also integrated into the URL Fuzzer.

How it works

When fuzzing for hidden files and directories, the URL Fuzzer sends many requests, most of which return error pages. The ML classifier:
  1. Analyzes each response
  2. Identifies soft 404 patterns
  3. Filters out false positives from the results
The URL Fuzzer doesn’t generate findings directly, but its results are cleaned by the ML classifier to show only legitimate discoveries.
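The three steps above can be pictured as a small filtering pipeline. The Python sketch below is illustrative only. It replaces the ML classifier with a much simpler stand-in heuristic: if the same response body comes back for many different paths, it is treated as a catch-all error page. The `FuzzResult` type, the `filter_soft_404s` function, and the `max_repeats` parameter are hypothetical names, not part of the URL Fuzzer.

```python
import hashlib
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FuzzResult:
    url: str
    status: int
    body: str


def fingerprint(body: str) -> str:
    """Hash the body so identical responses can be counted cheaply."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def filter_soft_404s(results: list[FuzzResult], max_repeats: int = 3) -> list[FuzzResult]:
    """Drop 200 responses whose body repeats across more than max_repeats paths.

    Only 200 responses are checked; other statuses pass through unchanged
    for the caller to handle.
    """
    # Step 1: analyze each response (here: fingerprint every 200 body).
    counts = Counter(fingerprint(r.body) for r in results if r.status == 200)

    kept = []
    for result in results:
        # Step 2: identify soft 404 patterns (here: a body shared by many paths).
        is_soft_404 = (
            result.status == 200
            and counts[fingerprint(result.body)] > max_repeats
        )
        # Step 3: filter out false positives from the results.
        if not is_soft_404:
            kept.append(result)
    return kept
```

A repeated-body check like this misses error pages that echo the requested path or include timestamps, which is one reason a trained classifier is a better fit than exact matching.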

AI data handling

  • Proprietary models: The soft 404 detector uses our own self-hosted classification models
  • Secure infrastructure: Data is processed within our isolated infrastructure
  • No external training: Your data is not used to train any AI models
  • Predictive, not generative: These models classify data rather than generate content, eliminating “hallucination” risks
For complete details on how we handle data in our AI features, see our AI Data Policy.