Healthcare applications handle Protected Health Information (PHI). A patient uploads a medical record, a scan, a prescription. These files are sensitive and regulated.
HIPAA (Health Insurance Portability and Accountability Act) sets strict requirements for how PHI must be protected. File uploads are a critical control point.
Most healthcare apps implement the minimum: encrypt the file in transit, store it encrypted at rest, maintain access logs. But HIPAA requires more. This guide covers what "more" means in practice.
What HIPAA Requires for File Uploads
HIPAA's Security Rule outlines safeguards for electronic PHI. For file uploads, key sections apply:
164.308(a)(1) - Security Management Process: Covered entities must implement policies and procedures to manage security risks identified through ongoing risk assessment.
For file uploads, this means: You must analyze risks posed by file uploads and implement controls to address them.
164.308(a)(3) - Workforce Security: Implement policies and procedures for authorization and supervision of workforce members, so that each member has appropriate access to electronic PHI.
For file uploads: Only authorized users can upload files. Access is logged and monitored.
164.312(a)(1) - Access Control: Implement technical policies and procedures that allow only authorized persons or software programs to access electronic PHI.
For file uploads: Files uploaded by patients can only be accessed by authorized providers. Files uploaded by providers can only be accessed by authorized staff.
164.312(b) - Audit Controls: Implement hardware, software, and procedural mechanisms to record and examine activity in information systems containing electronic PHI.
For file uploads: Every upload must be logged with timestamp, user, filename, size, validation results, and outcome.
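164.312(a)(2)(iv) - Encryption and Decryption: Implement a mechanism to encrypt and decrypt electronic PHI. Encryption in transit falls under the companion transmission security standard, 164.312(e)(1).
For file uploads: Files are encrypted in transit (TLS) and at rest. AES-256 is the common choice; the rule itself does not name an algorithm.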
The Risk: What Can Go Wrong
Before designing controls, understand the risks:
1. Unauthorized Access
- A patient uploads a file and another patient can access it
- A staff member accesses files beyond their authorization scope
- An attacker gains access to the S3 bucket and downloads all files
2. Data Integrity
- A file is modified after upload without detection
- A blank file is uploaded and treated as a valid medical record
- A file's content is corrupted during storage
3. Non-repudiation
- A patient claims they uploaded a file but didn't
- A provider claims they reviewed a file but didn't
- No audit trail proves who did what
4. File Validation
- A user uploads a disguised executable instead of a PDF
- A file contains metadata (EXIF) that exposes location or device
- A file is a polyglot that exploits multiple parsers
5. Compliance Documentation
- No evidence that uploaded files were validated
- No proof of encryption implementation
- No audit trail showing how long files were stored
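The data integrity risks in item 2 are commonly handled by recording a cryptographic hash at upload time and re-checking it whenever the file is read back. The following is a minimal sketch using only the Python standard library. The function names `content_hash` and `verify_integrity` are illustrative and do not come from any framework used in this guide.

```python
import hashlib
import hmac


def content_hash(file_content: bytes) -> str:
    """Return the SHA-256 digest of the uploaded bytes as a hex string."""
    return hashlib.sha256(file_content).hexdigest()


def verify_integrity(file_content: bytes, expected_hash: str) -> bool:
    """Check stored bytes against the hash recorded at upload time."""
    actual = content_hash(file_content)
    # compare_digest runs in constant time for equal-length inputs
    return hmac.compare_digest(actual, expected_hash)


original = b"%PDF-1.7 example medical record"
recorded = content_hash(original)  # store this next to the file record

unchanged_ok = verify_integrity(original, recorded)
tampered_ok = verify_integrity(original + b"x", recorded)
print(unchanged_ok, tampered_ok)
```

Storing the hash in the audit log as well as the file record gives two independent places to detect a silent modification.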
Control 1: Authentication and Authorization
Only authorized users can upload files.
# Bad: No authorization check
@app.route('/upload', methods=['POST'])
def upload():
    file = request.files['file']
    # No login or authorization check: anyone who can reach the endpoint can upload
    save_to_s3(file)

# Good: Authorization check
# (request.user assumes middleware attaches the authenticated user to the
# request; with Flask-Login the equivalent is current_user)
@app.route('/upload', methods=['POST'])
@login_required
def upload():
    file = request.files['file']
    patient_id = request.args.get('patient_id')
    # Verify user is authorized to upload for this patient
    if not user_can_upload_for_patient(request.user, patient_id):
        log_security_event('unauthorized_upload_attempt', {
            'user': request.user.id,
            'patient': patient_id,
            'timestamp': datetime.utcnow()
        })
        return 'Unauthorized', 403
    save_to_s3(file, patient_id)
    return {'status': 'success'}, 201
Every upload must have:
- Authenticated user (login required)
- Authorized context (user can upload for this patient/organization)
- Audit log of authorization check
Control 2: Access Control at Rest
Access to files stored in S3 must be restricted to authorized users.
import secrets
from datetime import datetime

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')

def upload_with_access_control(file_content, patient_id, uploaded_by_user_id):
    """
    Upload file with access control metadata.
    """
    # Generate unique file identifier (the .pdf suffix assumes PDF uploads)
    file_id = f"{patient_id}/{secrets.token_hex(16)}.pdf"
    # Upload with encryption (SSE-KMS, matching the bucket policy in Control 4)
    s3.put_object(
        Bucket='phi-uploads',
        Key=file_id,
        Body=file_content,
        ServerSideEncryption='aws:kms',
        # Metadata for access control (S3 metadata values must be strings)
        Metadata={
            'patient-id': str(patient_id),
            'uploaded-by': str(uploaded_by_user_id),
            'upload-timestamp': datetime.utcnow().isoformat(),
            'classification': 'PHI'
        }
    )
    # Store access control record in database
    db.file_access_control.insert_one({
        'file_id': file_id,
        'patient_id': patient_id,
        'uploaded_by': uploaded_by_user_id,
        'authorized_users': [uploaded_by_user_id],  # Only uploader initially
        'authorized_roles': ['provider', 'clinician'],  # Can be expanded
        'created_at': datetime.utcnow()
    })
    return file_id

def download_file(file_id, requesting_user_id):
    """
    Download file only if user is authorized.
    """
    # Check authorization (a missing access record is treated as a denial)
    access_record = db.file_access_control.find_one({'file_id': file_id})
    if access_record is None or not is_user_authorized(requesting_user_id, access_record):
        log_security_event('unauthorized_file_access_attempt', {
            'file': file_id,
            'user': requesting_user_id
        })
        raise PermissionError("Not authorized")
    # Log successful access
    log_security_event('file_accessed', {
        'file': file_id,
        'user': requesting_user_id,
        'timestamp': datetime.utcnow()
    })
    # Download from S3
    response = s3.get_object(Bucket='phi-uploads', Key=file_id)
    return response['Body'].read()
Key points:
- Files stored with no public access
- Access determined by database records, not S3 permissions
- Every access logged with timestamp and user
- Changes to access control logged
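The download path depends on `is_user_authorized`, which is not defined in the code above. A minimal sketch of how it might evaluate an access record follows. The `USER_ROLES` lookup table and the optional `roles` parameter are assumptions for illustration; a real system would resolve roles from its user store.

```python
# Hypothetical role lookup; stands in for a query against the user store.
USER_ROLES = {"u-100": "provider", "u-200": "billing"}


def is_user_authorized(requesting_user_id, access_record, roles=USER_ROLES):
    """Return True only if the access record explicitly covers this user."""
    if access_record is None:
        return False  # unknown file: deny
    if requesting_user_id in access_record.get("authorized_users", []):
        return True  # explicit per-user grant
    role = roles.get(requesting_user_id)
    # Role-based grant; users with no known role fall through to a denial
    return role in access_record.get("authorized_roles", [])


record = {
    "file_id": "p-1/abc.pdf",
    "authorized_users": ["u-300"],
    "authorized_roles": ["provider", "clinician"],
}
print(is_user_authorized("u-100", record))
print(is_user_authorized("u-200", record))
```

Note that a role grant like this lets every provider read the file. Tying the role check to a care relationship with the specific patient would narrow it.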
Control 3: Encryption in Transit
Files must be encrypted while traveling over the network.
# Bad: HTTP (unencrypted)
# curl http://example.com/upload

# Good: HTTPS with TLS 1.2+
# curl https://example.com/upload

# Flask configuration
app.config['SESSION_COOKIE_SECURE'] = True
app.config['SESSION_COOKIE_HTTPONLY'] = True
app.config['SESSION_COOKIE_SAMESITE'] = 'Strict'

# Nginx configuration
server {
    listen 443 ssl;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;
    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;
}

# HSTS header (enforce HTTPS)
@app.after_request
def set_security_headers(response):
    response.headers['Strict-Transport-Security'] = 'max-age=31536000; includeSubDomains'
    return response
Minimum requirements:
- TLS 1.2 or higher
- Strong cipher suites
- Valid certificate (not self-signed)
- HSTS header to enforce HTTPS
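The same minimums apply when the application itself acts as a client, for example when it forwards an uploaded file to an internal service. The sketch below shows one way to enforce them with the standard library `ssl` module. The function name `make_tls_client_context` is illustrative.

```python
import ssl


def make_tls_client_context() -> ssl.SSLContext:
    """Build a client context that verifies certificates and refuses TLS < 1.2."""
    # create_default_context turns on certificate and hostname verification
    context = ssl.create_default_context()
    # Refuse to negotiate anything older than TLS 1.2
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


ctx = make_tls_client_context()
print(ctx.minimum_version, ctx.verify_mode, ctx.check_hostname)
```

The resulting context can be passed to standard library clients, for instance `urllib.request.urlopen(url, context=ctx)`.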
Control 4: Encryption at Rest
Files stored in S3 must be encrypted.
# Bad: No encryption
s3.put_object(Bucket='uploads', Key='file.pdf', Body=file_content)

# Good: Server-side encryption
s3.put_object(
    Bucket='uploads',
    Key='file.pdf',
    Body=file_content,
    ServerSideEncryption='AES256'  # or 'aws:kms'
)

# Best: Customer-managed KMS key
s3.put_object(
    Bucket='uploads',
    Key='file.pdf',
    Body=file_content,
    ServerSideEncryption='aws:kms',
    SSEKMSKeyId='arn:aws:kms:region:account:key/key-id'
)

# Enforce encryption for entire bucket
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::phi-uploads/*",
            "Condition": {
                "StringNotEquals": {
                    "s3:x-amz-server-side-encryption": "aws:kms"
                }
            }
        }
    ]
}
Key requirements:
- AES-256 minimum for encryption algorithm
- Customer-managed keys (KMS) preferred for better control
- Bucket policy enforcing encryption on all objects
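- Key rotation on a defined schedule (AWS KMS automatic rotation defaults to once a year; a 90-day period is an organizational policy choice, not a HIPAA mandate)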
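The `bucket_policy` dictionary above is defined but never attached to the bucket. The sketch below shows one way to build the same policy for any bucket name and serialize it. The helper name `build_encryption_policy` and the `Sid` value are assumptions added for illustration, and the boto3 call appears only in a comment so the example runs without AWS credentials.

```python
import json


def build_encryption_policy(bucket_name: str) -> dict:
    """Return a policy that denies uploads not encrypted with SSE-KMS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedUploads",  # illustrative statement ID
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            }
        ],
    }


# S3 expects the policy as a JSON string
policy_json = json.dumps(build_encryption_policy("phi-uploads"))
# With boto3 it would then be applied roughly like this:
#   s3.put_bucket_policy(Bucket="phi-uploads", Policy=policy_json)
print(policy_json)
```

With this policy in place, an upload that sets `ServerSideEncryption='AES256'` is rejected, so every `put_object` call against the bucket has to request `aws:kms`.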
Control 5: Comprehensive Audit Logging
Every upload must be logged with full context for compliance audits.
import json
from datetime import datetime

from flask import request

def log_upload_decision(decision, file_info, validation_result, user):
    """
    Log file upload decision with full audit trail.
    """
    log_entry = {
        "timestamp": datetime.utcnow().isoformat(),
        "event_type": "file_upload",
        "decision": decision,  # "accepted" or "rejected"
        "user": {
            "id": user.id,
            "username": user.username,
            "email": user.email,
            "role": user.role
        },
        "file": {
            # .get() keeps logging from failing when a field is unavailable,
            # for example on uploads rejected before storage
            "original_name": file_info.get('original_name'),
            "safe_name": file_info.get('safe_name'),
            "size_bytes": file_info.get('size'),
            "mime_type": file_info.get('mime_type'),
            "content_hash": file_info.get('content_hash'),
            "patient_id": file_info.get('patient_id')
        },
        "validation": {
            "extension_check": validation_result.get('extension_valid'),
            "mime_check": validation_result.get('mime_valid'),
            "size_check": validation_result.get('size_valid'),
            "blank_check": validation_result.get('blank_detected'),
            "malware_scan": validation_result.get('malware_scan'),
            "validation_details": validation_result.get('details', {})
        },
        "reason_if_rejected": "..." if decision == "rejected" else None,
        "ip_address": request.remote_addr,
        # Use the raw header string; the UserAgent object is not JSON serializable
        "user_agent": request.user_agent.string
    }
    # Append to the audit log file. Append mode alone does not make the log
    # immutable; that guarantee has to come from the storage layer.
    with open('/var/log/phi-uploads-audit.log', 'a') as audit_log:
        audit_log.write(json.dumps(log_entry) + '\n')
    # Also store in database for querying
    db.audit_log.insert_one(log_entry)
    # Alert if suspicious
    if should_alert(log_entry):
        send_security_alert(log_entry)
Audit log must include:
- Timestamp of upload
- User who uploaded (ID, username, role)
- File details (name, size, hash)
- All validation results
- Accept/reject decision and reason
- IP address and user agent
- Patient ID (for correlation)
Audit log requirements:
- Immutable (append-only, cannot be modified)
- Stored separately from application data
- Retained for a minimum of 6 years (the documentation retention period in 164.316(b)(2)(i), which is commonly applied to audit logs)
- Accessible for audit and investigation
- Backed up regularly
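Appending to a file does not by itself satisfy the immutability requirement, because anyone with write access can still edit earlier lines. One common way to make tampering detectable is to chain entries together by hash. The sketch below illustrates that idea with the standard library. The names `append_entry`, `verify_chain`, and `GENESIS_HASH` are illustrative, and the in-memory list stands in for whatever log storage is actually used.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(payload: dict) -> str:
    """Hash a canonical JSON encoding so key order cannot change the result."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_entry(chain: list, event: dict) -> dict:
    """Append an event, binding it to the hash of the previous entry."""
    prev_hash = chain[-1]["entry_hash"] if chain else GENESIS_HASH
    body = {"event": event, "prev_hash": prev_hash}
    entry = {**body, "entry_hash": _digest(body)}
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every hash; any edited or removed entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        body = {"event": entry["event"], "prev_hash": entry["prev_hash"]}
        if entry["prev_hash"] != prev_hash or entry["entry_hash"] != _digest(body):
            return False
        prev_hash = entry["entry_hash"]
    return True


log = []
append_entry(log, {"event_type": "file_upload", "decision": "accepted", "user": "u-100"})
append_entry(log, {"event_type": "file_accessed", "user": "u-100"})
valid_before = verify_chain(log)

log[0]["event"]["decision"] = "rejected"  # simulate tampering with an old entry
valid_after = verify_chain(log)
print(valid_before, valid_after)
```

A hash chain makes tampering detectable, not impossible. It works best alongside storage-level protections and a copy of the latest hash kept somewhere the application cannot write to.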
Control 6: File Validation
Files must be validated before acceptance.
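import hashlib
import os

from uplint import Uplint
from werkzeug.utils import secure_filename

uplint = Uplint(api_key=os.environ['UPLINT_API_KEY'])

async def validate_medical_file(file_content, file_name):
    """
    Validate medical file uploads.
    """
    result = await uplint.validate(file_content, {
        'context': 'medical-document',
        'detectBlanks': True,
        'scan': True  # Malware scanning
    })
    return {
        'trusted': result.trusted,
        'blank': result.blank,
        'malware': result.malware_detected,
        'reason': result.reason if not result.trusted else None,
        'details': {
            'has_readable_content': result.has_content,
            'file_type_detected': result.detected_type,
            'structural_valid': result.structure_valid
        }
    }

@app.route('/upload', methods=['POST'])
@login_required
async def upload_medical_document():
    file = request.files['file']
    patient_id = request.args.get('patient_id')
    file_content = file.read()
    # Build the file_info dict that log_upload_decision expects
    file_info = {
        'original_name': file.filename,
        'safe_name': secure_filename(file.filename),
        'size': len(file_content),
        'mime_type': file.mimetype,
        'content_hash': hashlib.sha256(file_content).hexdigest(),
        'patient_id': patient_id
    }
    # Validate. The coroutine must be awaited, so the view is async
    # (Flask 2.0+ installed with the async extra).
    validation = await validate_medical_file(file_content, file.filename)
    if not validation['trusted']:
        log_upload_decision('rejected', file_info, validation, request.user)
        return {
            'error': 'File rejected',
            'reason': validation['reason']
        }, 400
    if validation['blank']:
        log_upload_decision('rejected', file_info, validation, request.user)
        return {
            'error': 'File contains no readable content'
        }, 400
    # Log and store
    log_upload_decision('accepted', file_info, validation, request.user)
    file.stream.seek(0)  # rewind: read() above consumed the stream
    store_file(file, patient_id)
    return {'status': 'success'}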
Validation requirements:
- No blank files (empty PDFs, zero-content spreadsheets)
- No malware/threats
- No executable disguises
- Structural integrity verified
- Content analysis performed
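A cheap first-pass check for "executable disguises" is to compare the claimed extension with the file's leading bytes before sending it to a full scanner. The sketch below uses only the standard library. The function name `sniff_upload` and the small signature table are illustrative; this check does not detect polyglot files or malware and is not a substitute for the validation service.

```python
PDF_MAGIC = b"%PDF-"

# Leading bytes of common executable formats
EXECUTABLE_MAGICS = {
    b"MZ": "Windows PE executable",
    b"\x7fELF": "ELF executable",
}


def sniff_upload(file_content: bytes, claimed_extension: str) -> dict:
    """Compare the claimed extension with the file's leading bytes."""
    if len(file_content) == 0:
        return {"ok": False, "reason": "file is empty"}
    for magic, label in EXECUTABLE_MAGICS.items():
        if file_content.startswith(magic):
            return {"ok": False, "reason": f"content looks like a {label}"}
    if claimed_extension.lower() == ".pdf" and not file_content.startswith(PDF_MAGIC):
        return {"ok": False, "reason": "extension is .pdf but content is not a PDF"}
    return {"ok": True, "reason": None}


real_pdf = sniff_upload(b"%PDF-1.7\n%example", ".pdf")
disguised = sniff_upload(b"MZ\x90\x00\x03", ".pdf")
print(real_pdf, disguised)
```

Rejecting on leading bytes first keeps obviously mislabeled files away from downstream parsers.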
Control 7: Data Retention and Deletion
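Retention periods for medical records come mostly from state law rather than from HIPAA itself, but HIPAA does require policies for the final disposal of electronic PHI (164.310(d)(2)(i)). Define a retention window and delete files securely once it ends.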
from datetime import datetime, timedelta

# 7 years in days (example value; the required period depends on state law
# and record type)
RETENTION_WINDOW = 7 * 365

def schedule_deletion(file_id, patient_id):
    """
    Schedule file for deletion after retention window.
    """
    deletion_date = datetime.utcnow() + timedelta(days=RETENTION_WINDOW)
    db.deletion_schedule.insert_one({
        'file_id': file_id,
        'patient_id': patient_id,
        'deletion_date': deletion_date,
        'created_at': datetime.utcnow()
    })

def delete_old_files():
    """
    Run daily to delete files past retention window.
    """
    cutoff = datetime.utcnow()
    files_to_delete = db.deletion_schedule.find({
        'deletion_date': {'$lt': cutoff}
    })
    for record in files_to_delete:
        # Delete the object (S3 has no overwrite-in-place; see secure_delete_from_s3)
        secure_delete_from_s3(record['file_id'])
        # Log deletion
        log_security_event('file_deleted', {
            'file': record['file_id'],
            'patient': record['patient_id'],
            'reason': 'retention_window_expired',
            'timestamp': datetime.utcnow()
        })
        db.deletion_schedule.delete_one({'_id': record['_id']})

def secure_delete_from_s3(file_id):
    """
    Securely delete file from S3.
    """
    # S3 Object Lock (WORM, Write-Once-Read-Many) can be configured to
    # prevent deletion before the retention date, which is good for compliance.
    # For standard buckets, deletion is sufficient with encryption.
    # On a versioned bucket, delete_object without a VersionId only adds a
    # delete marker; earlier versions must be removed separately.
    s3.delete_object(Bucket='phi-uploads', Key=file_id)
    # Log the deletion
    log_security_event('s3_object_deleted', {
        'file_id': file_id,
        'timestamp': datetime.utcnow()
    })
Retention requirements:
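- Medical records: retention window set by applicable state law (7 years is used here as an example)
- Automatic deletion after retention window
- Audit log of all deletions
- No recovery of deleted files (on versioned buckets, remove earlier versions too)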
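Computing the window as `7 * 365` days ignores leap days, so the scheduled deletion date lands a day or two before the calendar anniversary. Where the policy is written in calendar years, the expiry date can be computed by date arithmetic instead. The following is a sketch with illustrative function names, using only the standard library.

```python
from datetime import datetime, timedelta, timezone


def retention_expiry(created_at: datetime, years: int) -> datetime:
    """Return the same calendar date `years` later, handling Feb 29."""
    try:
        return created_at.replace(year=created_at.year + years)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap target year; use Feb 28
        return created_at.replace(year=created_at.year + years, day=28)


def is_past_retention(created_at: datetime, years: int, now: datetime) -> bool:
    """True once the retention window for this record has fully elapsed."""
    return now >= retention_expiry(created_at, years)


created = datetime(2020, 1, 15, tzinfo=timezone.utc)
by_calendar = retention_expiry(created, 7)
by_days = created + timedelta(days=7 * 365)
print(by_calendar.date(), by_days.date())
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the check easy to test.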
Control 8: Regular Security Assessments
HIPAA requires periodic technical and nontechnical evaluation of security controls (164.308(a)(8)).
def conduct_file_upload_security_assessment():
    """
    Quarterly assessment of file upload security.
    """
    assessment = {
        'timestamp': datetime.utcnow().isoformat(),
        'assessment_areas': {}
    }
    # 1. Check encryption in transit
    assessment['assessment_areas']['tls'] = {
        'status': verify_tls_1_2_minimum(),
        'last_cert_update': get_cert_update_date(),
        'findings': []
    }
    # 2. Check encryption at rest
    assessment['assessment_areas']['s3_encryption'] = {
        'status': verify_s3_encryption_enabled(),
        'key_rotation': check_kms_key_rotation(),
        'findings': []
    }
    # 3. Check access controls
    assessment['assessment_areas']['access_control'] = {
        'unauthorized_access_attempts': count_unauthorized_attempts(),
        'orphaned_access_records': find_orphaned_records(),
        'findings': []
    }
    # 4. Check audit logging
    assessment['assessment_areas']['audit_logging'] = {
        'logs_retained': verify_logs_retained(),
        'logs_backed_up': verify_backup(),
        'log_integrity': verify_immutability(),
        'findings': []
    }
    # 5. Validate file uploads from last quarter
    assessment['assessment_areas']['file_validation'] = {
        'total_uploads': count_uploads_last_quarter(),
        'blank_files_caught': count_blank_files_rejected(),
        'threats_detected': count_threats_detected(),
        'validation_coverage': 100,  # Should be 100%
        'findings': []
    }
    # Generate report
    return assessment
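The assessment dictionary is only useful if someone acts on it. A small helper can reduce it to the list of areas that need follow-up, as sketched below. The function name `failing_areas` is illustrative, and the rule it applies (a status of `False` or a non-empty `findings` list) is an assumption about how the helper functions above report results.

```python
def failing_areas(assessment: dict) -> list:
    """List assessment areas that report a failed status or any findings."""
    failing = []
    for name, area in assessment.get("assessment_areas", {}).items():
        status_failed = area.get("status") is False
        if status_failed or area.get("findings"):
            failing.append(name)
    return sorted(failing)


sample = {
    "timestamp": "2024-01-01T00:00:00",
    "assessment_areas": {
        "tls": {"status": True, "findings": []},
        "s3_encryption": {"status": False, "findings": []},
        "audit_logging": {"findings": ["backup older than 30 days"]},
    },
}
needs_follow_up = failing_areas(sample)
print(needs_follow_up)
```

Recording this list with each quarterly report gives auditors a direct trail from finding to remediation.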
Putting It Together
A HIPAA-compliant file upload looks like:
@app.route('/upload', methods=['POST'])
@login_required
@require_https
async def upload_phi_file():
    """HIPAA-compliant file upload endpoint."""
    # 1. Authorization
    patient_id = request.args.get('patient_id')
    if not user_authorized_for_patient(request.user, patient_id):
        return 'Unauthorized', 403
    # 2. Read and validate
    file = request.files['file']
    file_content = file.read()
    validation = await validate_medical_file(file_content, file.filename)
    if not validation['trusted'] or validation['blank']:
        log_upload_decision('rejected', {
            'original_name': file.filename,
            'size': len(file_content)
        }, validation, request.user)
        return {'error': 'File rejected'}, 400
    # 3. Encrypt and store
    file_id = upload_with_access_control(file_content, patient_id, request.user.id)
    # 4. Schedule retention
    schedule_deletion(file_id, patient_id)
    # 5. Audit log
    log_upload_decision('accepted', {
        'original_name': file.filename,
        'safe_name': file_id,
        'size': len(file_content),
        'patient_id': patient_id
    }, validation, request.user)
    return {
        'status': 'success',
        'file_id': file_id
    }, 201
Compliance Checklist
For HIPAA file upload compliance:
- TLS 1.2+ for all file transfers
- AES-256 encryption at rest
- Access controls with authorization checks
- Audit logging of all uploads and accesses
- File validation (blank detection, malware scanning)
- Retention policies enforced
- Secure deletion after retention window
- Regular security assessments
- Backup and recovery procedures
- Incident response plan for file access breaches
Key Takeaway
HIPAA doesn't forbid file uploads, but it requires robust controls. Most healthcare apps implement encryption and logging. Fewer implement comprehensive validation and access control. Compliance means all eight controls.
Uplint provides file validation for healthcare apps: blank detection, malware scanning, structural validation, and comprehensive audit logging. Built specifically for HIPAA-regulated file handling.