Web App Pen Testing: What Gets Tested and Why It Matters

Development teams spend months on security. Input validation on every form. Rate limiting on the login page. Content Security Policy headers and strong TLS configuration, all genuinely good work. Then a tester changes a number in the URL and pulls up another customer's invoices. The entire access control model relied on the front end hiding buttons that the user shouldn't click. The API behind it served whatever you asked for.
That is a typical web application finding: the technical controls were genuinely solid, and the application was still wide open. Here's the gap: no scanner would ever catch it, because no scanner understands what the application is supposed to do.
What a web app test covers that infrastructure testing does not
Your web application sits on a server, and the two require different types of testing. An infrastructure test checks whether that server is patched, whether the network around it is properly segmented, and whether the TLS certificate is valid. Important work, but it tells you nothing about the application running on top.
A web application test checks the application itself. Does the login page leak which usernames exist? Can a standard user reach admin functions by modifying the request? Does the search box sanitise input, or can someone inject a database query through it? Is the session token predictable enough that an attacker could guess a valid one?
These are different questions from "is the server patched." They require different skills and different tools.
The OWASP Top 10 and what each category actually means
The OWASP Top 10 is the standard framework. I use it as the baseline for every engagement, but the real work is adapting it to the specific application.
Broken Access Control is number one on the list and the finding I report most often. Your application has user roles like customer, staff, admin, and maybe others. Each role should only access its own data and functions. In practice, the front end enforces this by showing different menus to different users. The back end often does not enforce it at all. Change a user ID in the request, add a parameter, or navigate directly to an admin URL, and you are in. This is the gap I found in that customer portal, and I find some version of it on most engagements.
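The fix is an object-level check on the server, not a hidden menu in the front end. A minimal sketch of what that check looks like (the `User` and `Invoice` types are illustrative, not from any specific framework):

```python
# Hypothetical sketch: the authorisation decision the back end must make
# for every request, regardless of what the front end displays.
from dataclasses import dataclass

@dataclass
class User:
    id: int
    role: str  # "customer", "staff", or "admin"

@dataclass
class Invoice:
    id: int
    owner_id: int

def can_view_invoice(user: User, invoice: Invoice) -> bool:
    """Authorise per object: admins see everything, everyone else
    only their own invoices. Hiding the link in the UI is not a control."""
    return user.role == "admin" or invoice.owner_id == user.id

alice = User(id=1, role="customer")
someone_elses_invoice = Invoice(id=42, owner_id=2)
print(can_view_invoice(alice, someone_elses_invoice))  # False: changing the ID in the URL fails here
```

The point is that the check runs on every request that touches the object, so modifying an ID in the URL hits the same gate as clicking the intended link.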
Injection is the classic. SQL injection, command injection, template injection. The application takes user input and treats it as code. If the search function passes your query directly into a database statement without parameterising it, I can extract the entire database through the search box. Modern frameworks make this harder to do accidentally, but legacy code, stored procedures, and custom query builders still produce injection flaws regularly.
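The difference between the two cases can be shown in a few lines with Python's standard-library sqlite3 (the table and data are invented for illustration):

```python
# Sketch: string concatenation versus a parameterised query.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

term = "' OR '1'='1"  # classic injection payload

# Vulnerable: user input concatenated straight into the statement,
# so the payload rewrites the query and every row comes back.
rows = conn.execute(
    "SELECT name FROM users WHERE name = '" + term + "'"
).fetchall()
print(rows)  # [('alice',)]

# Parameterised: the driver treats the input as data, never as SQL.
rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (term,)
).fetchall()
print(rows)  # [] -- the literal string matches nothing
```

Every mainstream database driver supports the second form; injection findings almost always trace back to code that built the first form by hand.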
Cryptographic Failures covers sensitive data that is not protected properly. Passwords stored as unsalted MD5 hashes. API keys embedded in client-side JavaScript. Session tokens sent over unencrypted connections on a mixed-content page. The application looks secure from the outside but the data handling underneath is wrong.
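For the password storage case, a minimal sketch of salted hashing with the standard library's PBKDF2, as opposed to unsalted MD5 (the iteration count here is illustrative; pick it from current OWASP guidance):

```python
# Sketch: per-user salt plus a slow key-derivation function,
# verified with a constant-time comparison.
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative; tune to current guidance and hardware

def hash_password(password: str) -> tuple[bytes, bytes]:
    salt = os.urandom(16)  # unique per user, stored alongside the hash
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)

salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))  # True
print(verify_password("wrong guess", salt, digest))                   # False
```

The salt defeats precomputed tables and the iteration count makes each guess expensive, which is exactly what unsalted MD5 fails to do.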
Security Misconfiguration is the catch-all for things that were left in their default state. Debug mode enabled in production, which outputs stack traces and internal paths to anyone who triggers an error. Default admin credentials on a framework that the developer forgot to remove. Verbose error messages that tell an attacker exactly which library version is running. Each one individually is minor. Together they give an attacker a map of the application's internals.
Authentication and Session Management covers weak login flows and session handling. I check whether passwords have a minimum complexity requirement, whether session tokens rotate after login, whether "remember me" tokens expire within a reasonable timeframe, and whether the password reset flow reveals information about which accounts exist. I regularly find applications where the password reset endpoint returns a different HTTP status code for valid and invalid email addresses. That is an account enumeration vulnerability that lets an attacker build a list of every user in the system.
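The enumeration fix is simple once stated: the reset endpoint returns the same status code and message whether or not the address exists. A sketch (the user store and response shape are invented for illustration):

```python
# Sketch: an enumeration-resistant password reset endpoint.
# The response is identical for valid and invalid addresses.
REGISTERED = {"alice@example.com"}

def request_reset(email: str) -> tuple[int, str]:
    if email in REGISTERED:
        pass  # queue the reset email here; never reflect this in the response
    # Same status code and body either way.
    return 200, "If that address is registered, a reset link has been sent."

print(request_reset("alice@example.com"))
print(request_reset("nobody@example.com"))  # indistinguishable to an attacker
```

The response timing also needs to be similar in both branches, since a measurable delay when the account exists leaks the same information.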
The other five categories on the OWASP list (Insecure Design, Vulnerable Components, Identification Failures, Integrity Failures, and Logging Failures) round out the framework. A thorough test covers all ten, but I spend the most time on whatever is most relevant to the specific application.
| OWASP Category | What I Am Checking | Scanner Catches It? |
|---|---|---|
| Broken Access Control | Can one user reach another's data or functions | Rarely |
| Injection | Does user input get executed as code | Partially |
| Cryptographic Failures | Is sensitive data properly encrypted | Sometimes |
| Security Misconfiguration | Are development defaults still active | Partially |
| Authentication Issues | Can the login flow be bypassed or abused | Partially |
| Insecure Design | Are there fundamental flaws in the application's logic | No |
| Vulnerable Components | Do third-party libraries have known CVEs | Yes |
| Identification Failures | Can user identity be spoofed | Rarely |
| Integrity Failures | Can code or data be modified undetected | No |
| Logging Failures | Would an attack be detected | No |
The pattern in that table is consistent. Scanners handle known technical vulnerabilities in third-party components. They struggle with anything that requires understanding what the application is supposed to do. That gap is where most of the serious findings live.
How a web app test runs day to day
Most engagements run between two and five days depending on the application. I need test accounts at different permission levels and documentation about what the application does. Good documentation speeds the whole process up considerably. No documentation means I spend the first day figuring out what the application is before I can test whether it is secure.
The first day is spent on mapping the application. I crawl every page, every form, every parameter, every endpoint. This goes deeper than an automated crawl because I am noting which functions matter most based on the data they handle and the business impact if they fail. A payment processing form gets more scrutiny than a contact form.
The testing phase works through each function against the OWASP categories. Can a customer account reach the admin endpoints through a modified request? Can one user view another's invoices by changing an ID? Does the file upload check the actual file content, or just the extension? What happens when I submit a form with unexpected data types? If the application has a multi-step process (registration, checkout, approval), I test what happens when steps are skipped or replayed out of order.
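The file upload question is worth a concrete example. A PNG file starts with a fixed eight-byte signature, so checking content rather than the filename catches a script renamed to `.png` (a minimal sketch; real validation would go further):

```python
# Sketch: validate the magic bytes, not the extension.
# The eight-byte value is the published PNG file signature.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"

def looks_like_png(data: bytes) -> bool:
    return data.startswith(PNG_MAGIC)

real_png = PNG_MAGIC + b"...rest of the image data..."
renamed_script = b"<?php system($_GET['cmd']); ?>"  # filename says .png, content says otherwise

print(looks_like_png(real_png))        # True
print(looks_like_png(renamed_script))  # False: rejected despite the extension
```

Extension checks alone fail this test on most engagements where uploads exist.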
For a payment application, I spend extra time on transaction manipulation. Can I change the amount after it has been calculated? Can I apply a discount code twice in the same transaction? Can I complete a purchase and then modify the order before it is processed? These are business logic tests that require understanding the intended workflow.
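The double-discount case illustrates why these rules belong on the server. A sketch of the kind of check I expect to find when the test fails to apply a code twice (names and the flat 10% discount are illustrative):

```python
# Sketch: a business rule enforced where the total is recalculated,
# not in the front end -- a code applies at most once per transaction.
def apply_discount(total: float, code: str, applied_codes: set[str]) -> float:
    if code in applied_codes:
        raise ValueError("code already applied to this transaction")
    applied_codes.add(code)
    return round(total * 0.9, 2)  # assume the code is a flat 10% off

applied: set[str] = set()
total = apply_discount(100.00, "SAVE10", applied)
print(total)  # 90.0
# A replayed request with the same code is rejected server-side:
# apply_discount(total, "SAVE10", applied) raises ValueError
```

When this state lives only in the browser, replaying the request applies the discount again, and that is exactly the request I replay.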
The last day is dedicated to writing the report. Every finding gets evidence (request and response captures, screenshots), a severity rating based on actual business impact, and remediation guidance written for the development team. I am specific about what needs fixing and how to fix it. "Parameterise the query in the search endpoint using prepared statements" is useful. "Fix the SQL injection" tells them nothing.
What I typically find
Broken access control appears on most engagements I run. The front end hides things from the user, but the back end serves them anyway. This is the number one finding across the industry, not just my own work.
Authentication gaps show up on most engagements as well. Session tokens that live for weeks are one example. Password reset flows that confirm whether an email address exists. Login forms with no rate limiting or account lockout, which means a brute force attack can run indefinitely.
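The missing lockout is the easiest of those to sketch: count recent failures per account and refuse attempts past a threshold (the threshold and window here are illustrative; a real implementation would persist this state and pair it with rate limiting by IP):

```python
# Sketch: a sliding-window lockout counter, so a brute force
# cannot run indefinitely against one account.
FAILURE_LIMIT = 5
LOCKOUT_SECONDS = 900  # 15-minute window

failures: dict[str, list[float]] = {}

def record_failure(username: str, now: float) -> None:
    failures.setdefault(username, []).append(now)

def is_locked(username: str, now: float) -> bool:
    recent = [t for t in failures.get(username, []) if now - t < LOCKOUT_SECONDS]
    failures[username] = recent
    return len(recent) >= FAILURE_LIMIT

for _ in range(5):
    record_failure("alice", now=1000.0)
print(is_locked("alice", now=1001.0))  # True: further attempts are refused
print(is_locked("alice", now=2000.0))  # False: the window has expired
```

Without something like this, a login form will happily accept millions of guesses, and the test confirms it by sending a few hundred and watching none get refused.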
Input validation failures are less common than they used to be because modern frameworks handle the basics, but custom code, older components, and anything that builds queries dynamically still produces injection flaws.
The business logic findings are the ones clients remember. A refund process that can be triggered multiple times. A registration flow that assigns elevated permissions based on a hidden form field. A file download endpoint that accepts path traversal characters and serves files from outside the intended directory. None of these appear in a scan report because no scanner knows the intended business rules.
How this relates to Cyber Essentials
Cyber Essentials and web application testing assess different layers. CE and CE Plus check whether your five technical controls are in place: firewalls, access control, patching, malware protection, and secure configuration. They do not test whether your application's code is secure.
If your business runs a customer-facing web application, you need both. CE Plus confirms the infrastructure beneath the application is right. A web app test confirms the application is right. The results from one do not replace the other.
When to test
Your application handles customer data, processes payments, or manages sensitive information: test annually at minimum. You are launching a new application or pushing a major update: test before release. A contract or regulator requires application-level testing: test to meet the requirement. You have had a security incident involving the application: test to understand the full extent of the exposure.
If the application has not changed significantly and your last test is less than 12 months old, those findings still give you an accurate picture of your exposure.
Related articles
- API Security Testing: What a Pen Tester Actually Checks
- Can AI Actually Do a Pen Test?
- Infrastructure Pen Testing: What We Actually Test
- The Five Cyber Essentials Controls: A Technical Guide