How HackerOne Reinvented Security for Developers
Our mission to mend the rift between security and development with AI began in 2022. We prioritized a human-in-the-loop (HITL) validation methodology based not just on our commitment to the responsible use of models, but on a thesis: reducing a model's role to binary categorization is a misuse of its potential. A human expert can confirm output as “right” or “wrong,” and then enrich the output that's “right” to make it smarter and more actionable.
We were right. When these principles are applied, application security controls can be not only compatible with development, but loved by developers.
Workflow Integration
Code security tools need to be accessible in the toolkit developers already use and in the workflows they already know. Git pull/merge requests, the standard for peer code review, were the ideal place to introduce the interface. Here, every way a user can access and interact with the platform is natively integrated end to end. If an engineer has experience with peer code review, they already know how to use it.
The experience is consistent across code repository providers, whether cloud-hosted or on-premises. It works just as well for a cloud-hosted GitHub repository as it does for a self-hosted Azure DevOps repository.
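To make the provider-agnostic integration concrete, here is a minimal sketch of how such an abstraction could look. The class and method names (CodeHostClient, fetch_diff, post_review_comment) are hypothetical illustrations, not HackerOne's actual implementation.

```python
from abc import ABC, abstractmethod


class CodeHostClient(ABC):
    """Hypothetical abstraction over pull/merge request providers."""

    @abstractmethod
    def fetch_diff(self, repo: str, pr_id: int) -> str:
        """Return the unified diff for a pull/merge request."""

    @abstractmethod
    def post_review_comment(self, repo: str, pr_id: int, path: str,
                            line: int, body: str) -> None:
        """Attach a comment to a specific line of the proposed change."""


class GitHubClient(CodeHostClient):
    def fetch_diff(self, repo: str, pr_id: int) -> str:
        ...  # call the GitHub REST API for the pull request diff

    def post_review_comment(self, repo: str, pr_id: int, path: str,
                            line: int, body: str) -> None:
        ...  # create a pull request review comment


class AzureDevOpsClient(CodeHostClient):
    def fetch_diff(self, repo: str, pr_id: int) -> str:
        ...  # call the Azure DevOps API for the pull request diff

    def post_review_comment(self, repo: str, pr_id: int, path: str,
                            line: int, body: str) -> None:
        ...  # create a comment thread on the pull request
```

Because everything downstream depends only on the interface, the developer-facing experience stays identical no matter which provider hosts the repository.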
Validation for Deterministic Warnings
Noise from security scanners fosters a distrust-by-default relationship and leads developers to second-guess even true positives. To rebuild developer trust, scanners need to be consistently right.
Knowing this, we built a Code Security Engine that combines some of the best scanning tools for static application security testing (SAST), software composition analysis (SCA), infrastructure as code (IaC), and secrets detection, working in tandem with a Context Engine that uses AI to assess the relevance and accuracy of their output, to enumerate and prioritize warnings for HITL validation.
After validation, every finding is presented with remediation guidance from an experienced engineer who manually reviewed it, so it arrives with contextual understanding, prescriptive next steps, and an actual person who can help.
This multi-layered filtering ensures the controls that interact with developers activate only when a finding is important and actionable, and always with remediation support.
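As an illustration of how this kind of multi-layered filtering can work in practice, the sketch below scores scanner findings for contextual relevance and keeps only those worth an expert's attention. The names (Finding, context_relevance, prioritize) and the 0.6 threshold are assumptions for the example, not HackerOne's published design.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    scanner: str   # e.g. "sast", "sca", "iac", "secrets"
    rule_id: str
    path: str
    line: int
    message: str


def context_relevance(finding: Finding, changed_files: set[str]) -> float:
    """Hypothetical stand-in for the Context Engine: score 0.0-1.0 for how
    likely the warning is accurate and relevant to the proposed change."""
    score = 1.0 if finding.path in changed_files else 0.3
    # A real engine would also weigh reachability, data flow, and history.
    return score


def prioritize(findings: list[Finding], changed_files: set[str],
               threshold: float = 0.6) -> list[Finding]:
    """Keep only findings worth an expert's attention, highest score first."""
    scored = [(context_relevance(f, changed_files), f) for f in findings]
    kept = sorted((pair for pair in scored if pair[0] >= threshold),
                  key=lambda pair: pair[0], reverse=True)
    # These go to HITL validation, not straight to developers.
    return [f for _, f in kept]
```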
Validation for Non-Deterministic Risks
In parallel, to catch flaws at greater architectural depth, our Hai Hotspots model traverses the proposed changes and the surrounding repositories. Designed to mimic how a human engineer would navigate a codebase hunting for security flaws, it poses unexpected scenarios that carry risk implications and then analyzes their reachability, using indexing techniques that draw on symbol definitions and references to understand the implementation.
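For a rough picture of how symbol-based reachability analysis works in general, the sketch below builds a call graph from (caller, callee) reference pairs and checks whether a target symbol is reachable from an entry point. It is a simplified, generic illustration, not the Hai Hotspots implementation.

```python
from collections import defaultdict, deque


def build_call_graph(references: list[tuple[str, str]]) -> dict[str, set[str]]:
    """references: (caller_symbol, callee_symbol) pairs taken from a symbol
    index of definitions and references across the repository."""
    graph: dict[str, set[str]] = defaultdict(set)
    for caller, callee in references:
        graph[caller].add(callee)
    return graph


def is_reachable(graph: dict[str, set[str]], entry: str, target: str) -> bool:
    """Breadth-first search: can execution starting at `entry` reach `target`?"""
    seen, queue = {entry}, deque([entry])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Example: is a changed helper reachable from an HTTP handler?
refs = [("handle_request", "parse_input"), ("parse_input", "render_template")]
graph = build_call_graph(refs)
print(is_reachable(graph, "handle_request", "render_template"))  # True
```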
The power of this technology lies in its non-deterministic output, which is weakly actionable if handed straight to a developer tasked with remediation, but highly actionable as input to review and investigation.
This is where HITL validation is critical: an expert manually and meticulously reviews the output within the context of the entire codebase, equipped with a powerful set of tools. If a risk is confirmed, it's sent to developers in the form of actionable next steps.
Feedback Loops That Listen and Learn
What if a security risk can’t be confirmed with 100% confidence? Are there multiple approaches to remediation?
HITL validation introduces an expert qualified for these discussions, and pull/merge requests are exactly the place to have them. Experts stay assigned to proposed changes for the remainder of the pull/merge request lifecycle, so anything learned from the discussion is retained, creating a smart, adaptive exception management process that doesn't slow developers down.
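One plausible way to retain what is learned from these discussions is to record each accepted risk as a scoped exception with an expiry, as in the hypothetical sketch below. The field names and the 90-day expiry are illustrative assumptions, not HackerOne's design.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class RiskException:
    """Hypothetical record of a risk accepted during pull request discussion."""
    rule_id: str      # which warning or hotspot was discussed
    path: str         # where the exception applies
    rationale: str    # what the developer and the expert agreed on
    approved_by: str  # the assigned expert
    expires: date = field(
        default_factory=lambda: date.today() + timedelta(days=90))


def is_suppressed(rule_id: str, path: str,
                  exceptions: list[RiskException]) -> bool:
    """Skip re-raising a warning that was already discussed and accepted,
    as long as the recorded exception has not expired."""
    today = date.today()
    return any(e.rule_id == rule_id and e.path == path and e.expires >= today
               for e in exceptions)
```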
The Human-in-the-Loop Experience
Our most advanced web application is one our customers never need to see: the platform where our network of experts analyzes engine outputs and manually reviews code.
When detected risk crosses a threshold, the output is loaded into a specialized, first-of-its-kind code review platform with the familiarity of an integrated development environment (IDE), where validation is conducted.
Reviewers need to absorb a lot quickly. Analysis of the code is visually sequenced by priority focus areas, with cognitive load in mind. Experts see what was changed and why, and can access unchanged areas to gain full context.
What Does It Look Like?
When proposed changes are analyzed and determined not to contain security risks, developers are informed quickly through built-in pipeline checks, which usually complete within 2 minutes.
When changes contain possible security risks that need review, they're triaged for non-blocking review by a human expert. Validation is usually completed within 90 minutes.
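As a generic illustration of how fast, non-blocking feedback can surface on a pull request, the sketch below posts commit statuses through GitHub's commit status API under two contexts: a quick pipeline check and a separate, non-required context for expert review. The repository, SHA, token, and context names are placeholders, and this is not HackerOne's actual integration.

```python
import os

import requests

GITHUB_API = "https://api.github.com"
TOKEN = os.environ["GITHUB_TOKEN"]  # placeholder credential


def post_status(repo: str, sha: str, state: str,
                context: str, description: str) -> None:
    """Set a commit status on a pull request head commit.
    `state` is one of: pending, success, failure, error."""
    resp = requests.post(
        f"{GITHUB_API}/repos/{repo}/statuses/{sha}",
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Accept": "application/vnd.github+json",
        },
        json={"state": state, "context": context, "description": description},
        timeout=10,
    )
    resp.raise_for_status()


# Fast path: no risk detected, so the pipeline check passes within minutes.
post_status("org/repo", "abc123", "success",
            "security/pipeline-check", "No security risks detected")

# Slow path: expert review in progress, reported under a separate context
# that branch protection does not require, so the pull request is not blocked.
post_status("org/repo", "abc123", "pending",
            "security/expert-review", "Human expert validation in progress")
```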
Conclusion
Security controls that interface directly with developers need to understand how developers work. They need to be actionable and non-blocking, and they need to include remediation as part of the solution. HackerOne PullRequest makes this possible because of everything that happens behind the scenes. By combining human expertise with thoughtfully deployed AI models and agents, the platform can learn context, provide feedback, filter SAST and SCA warnings, find vulnerabilities, and help developers fix them, all within the workflows they already use and without sacrificing velocity.