OWASP File Upload Security: A Developer's Implementation Guide

OWASP (Open Web Application Security Project) maintains the industry-standard testing guide for web application security. Its file upload security section covers the critical controls that separate secure systems from vulnerable ones.

Most developers reference OWASP but don't fully implement their recommendations. This guide walks through each control with practical code, explains the reasoning, and shows common failures.

The OWASP Testing Guide for File Uploads

OWASP's testing methodology for file uploads focuses on these areas:

Test Upload of Executable Files
Test Overwriting Existing Files
Test Upload of Malicious Files
Test Handling of Dangerous File Types

From these concerns, eight essential controls emerge.

Control 1: Whitelist File Types (Not Blacklist)

What it means: Define exactly which file extensions your application accepts. Never create a "banned" list and allow everything else.

Why it matters: Blacklists are breakable. New file types emerge constantly. Unknown extensions can be dangerous. Whitelists force explicit decisions about what your system should accept.

Vulnerable pattern:

# BAD: Blacklist approach
DANGEROUS_EXTENSIONS = {'exe', 'bat', 'sh', 'cmd', 'php', 'jsp'}

def is_allowed(filename):
    ext = filename.rsplit('.', 1)[1].lower() if '.' in filename else ''
    return ext not in DANGEROUS_EXTENSIONS

This fails because:

New dangerous extensions aren't in the list
Archives (ZIP, TAR, RAR) can contain executables
Double extensions (file.php.jpg) can bypass the check
Null bytes can truncate the extension
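
A quick way to see these failure modes is to run the blacklist check against a few hostile names. The sketch below repeats the check from the vulnerable pattern so it runs on its own; the function name is_allowed_blacklist and the sample filenames are assumptions made for this demonstration.

```python
# Blacklist check from the vulnerable pattern, repeated so this block is self-contained
DANGEROUS_EXTENSIONS = {'exe', 'bat', 'sh', 'cmd', 'php', 'jsp'}

def is_allowed_blacklist(filename):
    ext = filename.rsplit('.', 1)[1].lower() if '.' in filename else ''
    return ext not in DANGEROUS_EXTENSIONS

# Illustrative filenames (assumptions for this demo)
samples = [
    'shell.phtml',    # often mapped to the PHP handler, but missing from the list
    'payload.zip',    # archive that may contain anything
    'shell.php.jpg',  # double extension: only the last part is checked
    'noextension',    # no extension at all
    'shell.php',      # the one case the blacklist does catch
]

for name in samples:
    verdict = 'accepted' if is_allowed_blacklist(name) else 'rejected'
    print(f'{name}: {verdict}')
```

Only the last sample is rejected, which is why the secure pattern inverts the logic and accepts nothing that is not explicitly listed.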

Secure pattern:

# GOOD: Whitelist approach
import re

ALLOWED_EXTENSIONS = {'pdf', 'jpg', 'jpeg', 'png', 'docx', 'xlsx'}

def is_allowed(filename):
    ext = filename.rsplit('.', 1)[1].lower() if '.' in filename else ''
    return ext in ALLOWED_EXTENSIONS

def validate_filename(filename):
    # Never allow multiple extensions
    if filename.count('.') > 1:
        return False

    # Never allow special characters (fullmatch also rejects a trailing newline)
    if not re.fullmatch(r'[a-zA-Z0-9_-]+\.[a-zA-Z0-9]+', filename):
        return False

    return is_allowed(filename)

Control 2: Verify MIME Type (Don't Trust the Client)

What it means: Check the file's actual content type, not what the browser claims it is.

Why it matters: The client sends the Content-Type header, so an attacker can claim any type. Verifying the actual content prevents disguised files.

Vulnerable pattern:

# BAD: Trusting the client
@app.route('/upload', methods=['POST'])
def upload():
    file = request.files['file']
    if file.content_type == 'image/jpeg':
        file.save('uploads/' + file.filename)
        return 'OK'

This fails because file.content_type comes from the browser, which an attacker controls.

Secure pattern:

import magic

ALLOWED_MIME_TYPES = {
    'application/pdf',
    'image/jpeg',
    'image/png',
    'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
}

def validate_mime_type(file_content):
    # Detect MIME type from actual file bytes
    detected_mime = magic.from_buffer(file_content, mime=True)
    return detected_mime in ALLOWED_MIME_TYPES

@app.route('/upload', methods=['POST'])
def upload():
    file = request.files['file']
    file_content = file.read()

    if not validate_mime_type(file_content):
        return 'Invalid file type', 400

    # Proceed with upload
    return 'OK'
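
Where installing libmagic is not an option, a much cruder check of the leading bytes can still catch obvious mismatches. This is a minimal sketch, not a replacement for python-magic: the SIGNATURES table and the sniff_type function are names invented here, and only three formats are covered.

```python
# Leading-byte signatures for a few formats (deliberately incomplete)
SIGNATURES = {
    'application/pdf': b'%PDF-',
    'image/jpeg': b'\xff\xd8\xff',
    'image/png': b'\x89PNG\r\n\x1a\n',
}

def sniff_type(file_content):
    """Return the MIME type whose signature starts the content, or None."""
    for mime, signature in SIGNATURES.items():
        if file_content.startswith(signature):
            return mime
    return None

# A PHP script renamed to .jpg does not start with the JPEG signature
print(sniff_type(b'<?php system($_GET["c"]); ?>'))       # None
print(sniff_type(b'\x89PNG\r\n\x1a\n' + b'\x00' * 16))   # image/png
```

A signature match only says the file starts like the claimed format. Files crafted to be valid in two formats at once can still pass, so this check belongs alongside the other controls rather than in place of them.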

Control 3: Enforce Strict File Size Limits

What it means: Set explicit maximum file sizes that match your use case.

Why it matters: Without limits, attackers can mount denial-of-service attacks by uploading enormous files. Legitimate files have known size ranges (profile photos are typically <5MB, documents <20MB).

Vulnerable pattern:

# BAD: No size limit
def upload():
    file = request.files['file']
    file.save('uploads/' + file.filename)

This allows attackers to upload 1GB+ files, exhausting disk space and network bandwidth.

Secure pattern:

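MAX_FILE_SIZE = 10 * 1024 * 1024  # 10 MB
MAX_FILE_SIZE_PROFILE = 5 * 1024 * 1024  # 5 MB for images

# Cap the whole request body; Flask answers larger requests with 413
app.config['MAX_CONTENT_LENGTH'] = MAX_FILE_SIZE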

def validate_size(file_size, context):
    if context == 'profile_photo':
        return file_size <= MAX_FILE_SIZE_PROFILE
    return file_size <= MAX_FILE_SIZE

@app.route('/upload', methods=['POST'])
def upload():
    file = request.files['file']
    context = request.form.get('context', 'default')

    # Measure the size by seeking to the end instead of reading the file into memory
    file.seek(0, 2)  # whence=2 is os.SEEK_END
    file_size = file.tell()
    file.seek(0)

    if not validate_size(file_size, context):
        return 'File too large', 413

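    # Save under a generated name (see Control 8), not the raw client-supplied filename
    file.save(os.path.join(UPLOAD_FOLDER, get_safe_filename(file.filename)))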
    return 'OK'
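
When the total size is not known up front, for example when the body is streamed, the limit can be enforced while reading. The following is a minimal sketch under that assumption; read_capped and FileTooLarge are hypothetical names introduced here, not part of Flask.

```python
import io

class FileTooLarge(Exception):
    """Raised when a stream exceeds the configured limit."""

def read_capped(stream, max_bytes, chunk_size=64 * 1024):
    """Read a stream in chunks, raising as soon as it exceeds max_bytes."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            # Stop immediately instead of buffering the rest of the upload
            raise FileTooLarge(f'more than {max_bytes} bytes')
        chunks.append(chunk)
    return b''.join(chunks)

data = read_capped(io.BytesIO(b'a' * 1000), max_bytes=1024)
print(len(data))  # 1000
```

The point of the design is that an oversized upload costs at most max_bytes plus one chunk of memory before it is rejected.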

Control 4: Store Files Outside the Web Root

What it means: Never store uploads in directories that are directly accessible via HTTP.

Why it matters: If an attacker manages to upload an executable file because other controls failed, storing it outside the web root prevents them from requesting it by URL and having the server execute it.

Vulnerable pattern:

# BAD: Storing in web-accessible directory
UPLOAD_FOLDER = '/var/www/html/uploads'

def upload():
    file = request.files['file']
    file.save(os.path.join(UPLOAD_FOLDER, file.filename))
    # Later, attacker accesses: http://example.com/uploads/malicious.php

Secure pattern:

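# GOOD: Store outside web root
import os
import secrets

UPLOAD_FOLDER = '/var/data/uploads'  # Not under /var/www/html
TEMP_FOLDER = '/tmp/processing'

def upload():
    file = request.files['file']

    # Generate secure filename (not the original).
    # get_safe_extension is an application helper, not shown here, that returns
    # the allow-listed extension from Control 1 or rejects the file.
    file_id = secrets.token_hex(16)
    safe_filename = file_id + '.' + get_safe_extension(file.filename)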

    upload_path = os.path.join(UPLOAD_FOLDER, safe_filename)
    file.save(upload_path)

    # Return file ID, not the filename
    return {'file_id': file_id}

def download(file_id):
    # Validate file_id exists in database
    file_record = db.query(File).filter(File.id == file_id).first()
    if not file_record:
        return 'Not found', 404

    # Serve from outside the web root using a download handler
    return send_file(file_record.path, as_attachment=True)
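
A download handler can also verify that whatever path it is about to serve still resolves inside the upload directory. The sketch below shows one way to do that with pathlib; resolve_inside is a hypothetical helper name, and Path.is_relative_to requires Python 3.9 or later.

```python
from pathlib import Path

def resolve_inside(base_dir, stored_name):
    """Return the resolved path for stored_name, or None if it escapes base_dir."""
    base = Path(base_dir).resolve()
    candidate = (base / stored_name).resolve()
    # Reject the base directory itself and anything outside it
    if candidate != base and candidate.is_relative_to(base):
        return candidate
    return None

print(resolve_inside('/var/data/uploads', 'a1b2c3.pdf'))
print(resolve_inside('/var/data/uploads', '../../etc/passwd'))  # None
print(resolve_inside('/var/data/uploads', '/etc/passwd'))       # None
```

Resolving both sides before comparing means `..` segments and symlinks are taken into account rather than compared as raw strings.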

Control 5: Disable Script Execution in Upload Directory

What it means: Configure the web server to never execute scripts in upload directories.

Why it matters: Defense in depth. Even if files end up in the web root by mistake, the server must not execute them.

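For Apache (.htaccess):

# Apache 2.4 syntax; on Apache 2.2 use "Order Allow,Deny" and "Deny from all" instead
<FilesMatch "\.(php|php3|php4|php5|php7|phps|phtml|phar|shtml|exe|bat|sh|cmd)$">
    Require all denied
</FilesMatch>

<Files *>
    SetHandler default-handler
</Files>

# Only valid when mod_php is loaded; remove this line otherwise
php_flag engine off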

For Nginx (server block):

location /uploads {
    # Disable script execution
    location ~ \.php$ {
        return 403;
    }

    location ~ \.sh$ {
        return 403;
    }

    # Serve as-is, never execute
    default_type application/octet-stream;
}
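
On POSIX systems the application can add one more layer by writing uploads without execute permission and refusing to overwrite existing files. This is a hedged sketch: write_without_exec is a name invented here, and it does not replace the server configuration above, because interpreters such as PHP read scripts without needing the execute bit.

```python
import os
import stat
import tempfile

def write_without_exec(path, data):
    """Create a new file with mode 0o640 (no execute bits) and write data to it."""
    # O_EXCL makes the call fail instead of overwriting an existing file
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o640)
    with os.fdopen(fd, 'wb') as f:
        f.write(data)

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'upload.bin')
    write_without_exec(target, b'example')
    mode = stat.S_IMODE(os.stat(target).st_mode)
    print(oct(mode & 0o111))  # 0o0: no execute bits set
```

The exclusive-create flag also addresses the "overwriting existing files" test case from the OWASP list, since a name collision raises FileExistsError instead of silently replacing a stored file.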

Control 6: Scan for Malware and Threats

What it means: Integrate with threat intelligence services to detect known malicious patterns.

Why it matters: Structural validation and extension checks don't catch legitimate-looking files with embedded malware. Threat scanning requires external databases of known signatures.

Pattern: Integration with threat service:

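import os
import time
import requests

VIRUSTOTAL_API = "https://www.virustotal.com/api/v3/files"
VIRUSTOTAL_ANALYSES_API = "https://www.virustotal.com/api/v3/analyses/"
VIRUSTOTAL_KEY = os.environ['VIRUSTOTAL_API_KEY']

def scan_with_virustotal(file_content, attempts=10, delay=15):
    # attempts and delay are illustrative defaults; tune them to your API quota
    files = {'file': file_content}
    headers = {'x-apikey': VIRUSTOTAL_KEY}

    # Submitting a file returns an analysis ID, not a verdict
    response = requests.post(VIRUSTOTAL_API, files=files, headers=headers, timeout=30)
    if response.status_code != 200:
        # API error: conservative approach is to reject
        return False
    analysis_id = response.json()['data']['id']

    # Poll the analysis until it completes
    for _ in range(attempts):
        response = requests.get(VIRUSTOTAL_ANALYSES_API + analysis_id,
                                headers=headers, timeout=30)
        if response.status_code != 200:
            return False
        attributes = response.json()['data']['attributes']
        if attributes['status'] == 'completed':
            # Check if any vendors detected threats
            stats = attributes['stats']
            return stats['malicious'] == 0 and stats['suspicious'] == 0
        time.sleep(delay)

    # No verdict in time: reject rather than accept an unscanned file
    return False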

def upload():
    file = request.files['file']
    file_content = file.read()

    if not scan_with_virustotal(file_content):
        return 'File detected as malicious', 403

    # Proceed with upload
    return 'OK'
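
Computing a content hash before scanning is a cheap addition: the digest can go into the audit log, and services such as VirusTotal support looking up a file by hash, which avoids re-uploading content that has already been analyzed. The sketch below covers only the hashing step; sha256_of_stream is a name chosen here.

```python
import hashlib
import io

def sha256_of_stream(stream, chunk_size=64 * 1024):
    """Hash a file-like object in chunks so large uploads are not held in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b''):
        digest.update(chunk)
    return digest.hexdigest()

print(sha256_of_stream(io.BytesIO(b'')))
# e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
```

A hash lookup only finds files the service has already seen, so a miss means "unknown", not "clean".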

Control 7: Implement Comprehensive Logging

What it means: Log every file upload decision with full context.

Why it matters: When incidents occur, you need to answer: Who uploaded what, when, what checks passed/failed, and what was the decision.

Pattern:

import logging
import json
from datetime import datetime, timezone

logger = logging.getLogger('file_uploads')
logger.setLevel(logging.INFO)  # without this, info-level entries are dropped
handler = logging.FileHandler('uploads.log')
formatter = logging.Formatter('%(asctime)s - %(message)s')
handler.setFormatter(formatter)
logger.addHandler(handler)

def upload():
    file = request.files['file']
    file_content = file.read()
    user_id = request.user.id  # assumes your auth layer attaches the user to the request
    context = request.form.get('context', 'default')

    # Run every check first, then derive the decision from the results
    checks = {
        "extension_valid": validate_filename(file.filename),
        "mime_valid": validate_mime_type(file_content),
        "size_valid": validate_size(len(file_content), context),
        "malware_scan": scan_with_virustotal(file_content)
    }
    accepted = all(checks.values())

    log_entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "filename": file.filename,
        "file_size": len(file_content),
        "mime_type": magic.from_buffer(file_content, mime=True),
        "checks": checks,
        "decision": "accept" if accepted else "reject"
    }

    # Log exactly one entry per upload, at a level that matches the decision
    if not accepted:
        logger.warning(json.dumps(log_entry))
        return 'Upload rejected', 400

    logger.info(json.dumps(log_entry))
    file.seek(0)  # rewind: file.read() above consumed the stream
    file.save(...)  # destination omitted; use a generated name outside the web root
    return 'OK'

Control 8: Rename Files to Remove Original Filename

What it means: Don't store files with the names users provided.

Why it matters: Original filenames can contain path traversal attempts (../../etc/passwd), special characters that break parsing, or encoding tricks that exploit decoders.

Secure pattern:

import os
import secrets

def get_safe_filename(original_filename):
    # Extract original extension only
    _, ext = original_filename.rsplit('.', 1) if '.' in original_filename else (None, 'bin')

    # Validate extension
    if ext.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError("Invalid extension")

    # Generate random filename
    safe_name = secrets.token_hex(16) + '.' + ext.lower()

    return safe_name

def upload():
    file = request.files['file']
    safe_filename = get_safe_filename(file.filename)

    # Store mapping from safe name to original
    db.insert_file_record({
        'safe_name': safe_filename,
        'original_name': file.filename,
        'user_id': request.user.id
    })

    file.save(os.path.join(UPLOAD_FOLDER, safe_filename))
    return {'file_id': safe_filename}
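
To see why the client-supplied name must never reach the filesystem, it helps to look at what a plain path join does with hostile input. The sketch below uses posixpath so the result is the same on any platform; naive_path is a name invented for the demonstration.

```python
import posixpath

UPLOAD_FOLDER = '/var/data/uploads'

def naive_path(filename):
    # What the storage path becomes when the client-supplied name is joined directly
    return posixpath.normpath(posixpath.join(UPLOAD_FOLDER, filename))

print(naive_path('report.pdf'))        # /var/data/uploads/report.pdf
print(naive_path('../../etc/passwd'))  # /var/etc/passwd
print(naive_path('/etc/passwd'))       # /etc/passwd
```

The second case climbs out of the upload directory, and the third discards it entirely because joining an absolute path replaces everything before it. A randomly generated name has neither problem.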

Putting It Together: A Complete Example

from flask import Flask, request
import magic
import secrets
import os
import logging

app = Flask(__name__)
ALLOWED_EXTENSIONS = {'pdf', 'jpg', 'jpeg', 'png', 'docx'}
MAX_FILE_SIZE = 10 * 1024 * 1024
UPLOAD_FOLDER = '/var/data/uploads'

logger = logging.getLogger('uploads')

def validate_upload(file_content, filename):
    # Check extension
    _, ext = filename.rsplit('.', 1) if '.' in filename else (None, '')
    if ext.lower() not in ALLOWED_EXTENSIONS:
        return False, 'Invalid extension'

    # Check size
    if len(file_content) > MAX_FILE_SIZE:
        return False, 'File too large'

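    # Check MIME type (every allowed extension needs an entry, or it is always rejected)
    mime = magic.from_buffer(file_content, mime=True)
    expected_mimes = {
        'pdf': 'application/pdf',
        'jpg': 'image/jpeg',
        'jpeg': 'image/jpeg',
        'png': 'image/png',
        'docx': 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
    }
    if mime != expected_mimes.get(ext.lower()):
        return False, 'MIME type mismatch'

    # Scan for malware (simplified); is_malicious stands in for a scanner
    # such as scan_with_virustotal from Control 6
    if is_malicious(file_content):
        return False, 'Malware detected'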

    return True, None

@app.route('/upload', methods=['POST'])
def upload():
    file = request.files['file']
    file_content = file.read()

    is_valid, error = validate_upload(file_content, file.filename)
    if not is_valid:
        logger.warning(f'Upload rejected: {error}')
        return {'error': error}, 400

    # Generate safe filename
    safe_filename = secrets.token_hex(16) + '.' + file.filename.rsplit('.', 1)[1].lower()
    file_path = os.path.join(UPLOAD_FOLDER, safe_filename)

    with open(file_path, 'wb') as f:
        f.write(file_content)

    logger.info(f'Upload accepted: {safe_filename}')
    return {'file_id': safe_filename}

Common Implementation Gaps

Incomplete extension validation:

Allowing multiple extensions (file.php.jpg)
Not lowercasing before checking
Allowing null bytes or special characters

Superficial MIME type checking:

Only checking the Content-Type header
Not validating actual file content

File size limits that are too generous:

Allowing 1GB files for simple documents
No rate limiting on upload volume

Storing in predictable locations:

Sequential filenames that attackers can guess
Original user-provided names preserved

No threat scanning:

Assuming legitimate users won't upload malicious files
Missing the reality that files can be compromised in transit
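
The rate-limiting gap listed above can be closed with a per-user sliding window. What follows is a minimal in-memory sketch, assuming a single process; UploadRateLimiter is a hypothetical class, and a multi-worker deployment would need shared storage instead of a local dictionary.

```python
import time
from collections import defaultdict, deque

class UploadRateLimiter:
    """Allow at most max_uploads per user within a sliding window of window_seconds."""

    def __init__(self, max_uploads, window_seconds, clock=time.monotonic):
        self.max_uploads = max_uploads
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._events = defaultdict(deque)

    def allow(self, user_id):
        now = self.clock()
        events = self._events[user_id]
        # Drop timestamps that have left the window
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        if len(events) >= self.max_uploads:
            return False
        events.append(now)
        return True

limiter = UploadRateLimiter(max_uploads=2, window_seconds=60)
print([limiter.allow('user-1') for _ in range(3)])  # [True, True, False]
```

In an upload handler, a False result would typically map to an HTTP 429 response before any file content is read.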

Using a Service Instead

Given the complexity of implementing all these controls correctly, many teams use Uplint as their upload validation layer:

pip install uplint

from uplint import Uplint

uplint = Uplint(api_key="your_api_key")

async def validate_upload(file):
    result = await uplint.validate(file, {
        "scan": True,
        "detectBlanks": True
    })

    return result.trusted

This replaces the entire control framework with a single API call.

Key Takeaways

OWASP's file upload controls are:

Whitelist file types (not blacklist)
Verify MIME type from content, not headers
Enforce size limits appropriate to your use case
Store outside web root to prevent execution
Disable script execution in upload directories
Scan for threats using external services
Log comprehensively with full context
Rename files to remove path traversal risks

These aren't optional guidelines. They're the minimum baseline for production systems handling untrusted uploads.

Uplint automates all eight OWASP controls in a single API call: extension validation, MIME verification, malware scanning, blank detection, and audit logging, with no configuration required.
