
Beyond the hype: how we built a machine-learning classifier (and refused to call it AI)

“AI-powered.” “Next-gen.” “Smart.” Security experts have heard it all before. And most of them are done listening.

Welcome to noise fatigue.

It’s what happens when every tool claims intelligence, but few deliver insight. When flashy dashboards mask weak results. And when “AI” becomes shorthand for “trust us.”


Offensive security professionals in particular don’t have time for that. They need signal. Fast. Clean. Trustworthy. Because every false positive is time lost, confidence eroded, and SLAs missed.

Accuracy has become the new product

Accuracy is no longer a nice-to-have; it’s what determines whether a tool gets used at all.

That’s exactly the problem our engineers decided to tackle head-on. And they didn’t do it with buzzwords. They did it with engineering.

Enter the fixers.

Cosmin Petrescu, ML engineer

Stefan Bratescu, Software & cybersecurity engineer

What kicked this off?

Cosmin and Stefan kept seeing the same support tickets: soft 404s, pages that look like an error to a human but return a 200 OK status code. These matter because soft 404s are the single biggest source of noise in scan results, and every false positive creates real friction for customers and engineers.
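To make the pattern concrete, here’s a minimal sketch (the URL and error markers are ours, purely for illustration; the article doesn’t prescribe either): probe a path that shouldn’t exist, and watch the status code and the body tell two different stories.

```python
# Illustrative only: the URL and error markers below are assumptions,
# not details from the article.
import requests

# Probe a path that should not exist on the target
resp = requests.get("https://example.com/this-page-should-not-exist")

# A well-behaved server answers 404 here; a soft-404 server answers 200
print(resp.status_code)

# ...while the body still reads like an error page to a human
error_markers = ("page not found", "does not exist", "404")
body = resp.text.lower()
print(any(marker in body for marker in error_markers))
```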

They’d also see users flag 50% of their scan results as junk: half the findings were simply irrelevant. Not great when your customers are measured on their ability to find things.

These are people who live by the quality of their tools. They need to move fast, assess infrastructure, document real issues, and move on. False positive results are not just annoying; they’re expensive.

Cosmin and Stefan didn’t start with grand ideas. They started with a theory. They weren’t chasing a trend; they were solving a problem.

They focused on one question: could they teach a system to learn what a real 404 looks like, even when it’s pretending not to be one?

They didn't turn to AI.

“AI is an abstract term… we didn’t use the term AI. We used machine learning because machine learning implies training, exactly what we did.” – Cosmin Petrescu

Why draw that line?

Because in offensive security, the difference between hype and reality is the difference between catching a vulnerability and missing it. 

“Just building another tool layered over AI wouldn’t have given us any good results.” – Stefan Bratescu

Yet that’s what most “AI” tools are: layers slapped over APIs, wrapped in hype.

“We could have used AI, but only if it were to have real impact.” – Cosmin Petrescu

Here’s where we draw the line between acceptable and unacceptable.

Acceptable:

  • Results backed by clear metrics

  • Machine learning models with transparent training and testing

  • Use cases that target real-world noise fatigue (like false positives)

  • Internal validation before public release

Unacceptable:

  • Vague “AI-powered” claims with no explainability

  • Tooltips that say “uses ChatGPT” without disclosing how

  • Features that sound smart but return low-signal data

  • “Innovation” that’s just wrapping old tech in new jargon


“The hype is built on already-built products. People add extra layers that are engineering layers, not machine-learning layers.

“I didn’t want to start a project that’s just API calls over ChatGPT; I don’t find that interesting.” – Cosmin Petrescu

So, what was interesting?

Training a classifier that could learn what a 404 looks like. They used real scan data, iterated over edge cases, and tested it on their own infrastructure first.
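The article doesn’t reveal the features or the model, but a toy version of the idea might look like this: reduce each HTTP response to a few numeric signals and fit a simple classifier on labeled scan data. Everything below (feature choices, model, numbers) is a hypothetical sketch, not Pentest-Tools.com’s implementation.

```python
# Hypothetical sketch: the real feature set, model, and data aren't
# public, so everything here is an assumption for illustration.
from sklearn.linear_model import LogisticRegression

# Each response reduced to a few numeric signals:
# [body size in KB, error-keyword count, similarity to a known-404 baseline]
X = [
    [0.5, 3, 0.91],   # tiny page, error wording, near-identical to the real 404
    [0.4, 2, 0.88],
    [48.2, 0, 0.08],  # large, content-rich page, nothing like the 404 baseline
    [22.7, 0, 0.12],
]
y = [1, 1, 0, 0]      # 1 = soft 404 (noise), 0 = genuine page

model = LogisticRegression().fit(X, y)

# A new finding that smells like a soft 404 gets flagged
print(model.predict([[0.6, 2, 0.85]]))  # -> [1]
```

The “similarity to a known-404 baseline” signal (request a guaranteed-missing path first, then compare every other response against it) is a common soft-404 heuristic; whether the production classifier uses anything like it is, again, an assumption.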

Then came the moment. “It works.” 404 variants (previously buried in irrelevant findings) were being flagged correctly. They saw fewer and fewer false positives. The classifier held up.

This wasn’t a breakthrough driven by VC buzz or a press release. It was driven by frustration, method, and discipline. Cosmin and Stefan didn’t just write code. They wrote a statement:

You can build better tools without riding the AI hype train. Pentest-Tools.com quietly proved that intelligent engineering beats artificial marketing every time.


In the next post, we’ll break down the process itself:

  • How it was trained

  • What features it looks for

  • How it continues to learn

Until then, keep your signal high and your buzzwords low.

Happy hacking!

Pentest-Tools.com 
