Database Backup Strategies
Data is the lifeblood of modern applications — and losing it can be catastrophic. Studies show that 60% of companies that suffer a major data loss shut down within six months. Whether it's a hardware failure, accidental deletion, ransomware attack, or a botched migration, database disasters happen to everyone eventually. The difference between a minor inconvenience and a business-ending event comes down to one thing: your backup strategy.
In this guide, we'll cover everything from the fundamental backup types to automating your backups, storing them securely in the cloud, and — critically — testing your ability to actually restore from them. By the end, you'll have a production-ready backup plan for MySQL, PostgreSQL, and MongoDB.
The industry-standard rule: keep 3 copies of your data, on 2 different media types, with 1 copy offsite. This protects against hardware failure, local disasters, and ransomware that might encrypt your primary backups.
Understanding Backup Types
Not all backups are created equal. The four main backup types each have different storage requirements, recovery times, and use cases. Choosing the right combination is key to balancing storage costs with recovery speed.
| Backup Type | What It Captures | Storage Size | Recovery Speed | Best For |
|---|---|---|---|---|
| Full Backup | Entire database snapshot | Large | Fastest | Weekly baseline |
| Incremental | Changes since last backup | Smallest | Slowest (chain) | Frequent daily backups |
| Differential | Changes since last full | Medium | Fast (full + one diff) | Daily with weekly full |
| Transaction Log | Individual transactions | Very Small | Point-in-time restore | High-frequency, PITR |
Choosing Your Backup Cadence
For most production databases, the recommended strategy combines backup types:
- Weekly full backups — taken during the lowest-traffic window (e.g., Sunday 2 AM)
- Daily differential backups — captures all changes since the last weekly full
- Hourly transaction log backups — enables point-in-time recovery to within the last hour
This strategy means the worst-case data loss window is one hour. Recovery requires applying the last full backup, then the latest differential, then the transaction logs up to the failure point.
Layered backup strategy: Full → Differential → Transaction Logs
MySQL Backup Strategies
MySQL offers several tools for backups, with mysqldump being the most widely used for logical backups. For large databases, Percona XtraBackup provides hot physical backups without locking your tables.
Logical Backups with mysqldump
Basic Full Database Backup
Export all databases including stored procedures, triggers, and events to a compressed SQL file.
# Full backup of all databases
mysqldump \
--user=backup_user \
--password \
--all-databases \
--single-transaction \
--routines \
--triggers \
--events \
--flush-logs \
--master-data=2 \
| gzip > /backups/mysql/full-$(date +%Y%m%d-%H%M%S).sql.gz
# Single database backup
mysqldump \
--user=backup_user \
--password \
--databases myapp_production \
--single-transaction \
--routines \
--triggers \
| gzip > /backups/mysql/myapp-$(date +%Y%m%d-%H%M%S).sql.gz
Restore from mysqldump
Decompress and pipe the SQL file back into MySQL to restore the database.
# Restore all databases
gunzip -c /backups/mysql/full-20260520-020000.sql.gz \
| mysql --user=root --password
# Restore single database to a new database
mysql --user=root --password -e "CREATE DATABASE myapp_restored;"
gunzip -c /backups/mysql/myapp-20260520-020000.sql.gz \
| mysql --user=root --password myapp_restored
# Verify the restore
mysql --user=root --password -e "
SELECT table_name, table_rows
FROM information_schema.tables
WHERE table_schema = 'myapp_restored'
ORDER BY table_rows DESC
LIMIT 10;"
Binary Log for Point-in-Time Recovery
MySQL's binary log records every data change, enabling point-in-time recovery (PITR). First, enable it in my.cnf:
[mysqld]
# Enable binary logging
log_bin = /var/log/mysql/mysql-bin.log
binlog_format = ROW
expire_logs_days = 7
max_binlog_size = 100M
server_id = 1
# Sync to disk on each commit for durability
sync_binlog = 1
innodb_flush_log_at_trx_commit = 1
# Step 1: Restore the last full backup
gunzip -c /backups/mysql/full-20260520-020000.sql.gz | mysql --user=root --password
# Step 2: Apply binary logs up to the failure time
# For example, restore up to 14:30:00 on May 20, 2026
mysqlbinlog \
--start-datetime="2026-05-20 02:00:00" \
--stop-datetime="2026-05-20 14:30:00" \
/var/log/mysql/mysql-bin.000001 \
/var/log/mysql/mysql-bin.000002 \
| mysql --user=root --password
# Step 3: Verify data is at the expected point
mysql --user=root --password -e "SELECT COUNT(*) FROM myapp.orders WHERE created_at < '2026-05-20 14:30:00';"
PostgreSQL Backup Strategies
PostgreSQL provides pg_dump for logical backups and WAL (Write-Ahead Log) archiving for continuous point-in-time recovery. For large databases, pg_basebackup creates a physical copy with minimal overhead.
Best for smaller databases and selective table restores:
# Full database dump in custom format (faster restore)
pg_dump \
--username=postgres \
--format=custom \
--compress=9 \
--file=/backups/pg/myapp-$(date +%Y%m%d-%H%M%S).dump \
myapp_production
# Restore from custom format dump
pg_restore \
--username=postgres \
--dbname=myapp_restored \
--jobs=4 \
--verbose \
/backups/pg/myapp-20260520-020000.dump
# Dump specific tables only
pg_dump \
--username=postgres \
--table=users \
--table=orders \
--format=custom \
myapp_production > /backups/pg/critical-tables.dump
Streaming physical backup — zero downtime, suitable for large databases:
# Create a physical base backup with WAL streaming
pg_basebackup \
--host=localhost \
--username=replication_user \
--pgdata=/backups/pg/basebackup-$(date +%Y%m%d) \
--format=tar \
--compress=9 \
--wal-method=stream \
--progress \
--verbose
# This creates:
# /backups/pg/basebackup-20260520/base.tar.gz (data files)
# /backups/pg/basebackup-20260520/pg_wal.tar.gz (WAL files)
Configure continuous WAL archiving in postgresql.conf for PITR:
# postgresql.conf settings for WAL archiving
wal_level = replica
archive_mode = on
archive_command = 'aws s3 cp %p s3://my-pg-backups/wal/%f'
archive_timeout = 300 # Archive every 5 minutes at minimum
# To restore to a specific time, create recovery.conf:
# (or recovery parameters in postgresql.conf for PG 12+)
restore_command = 'aws s3 cp s3://my-pg-backups/wal/%f %p'
recovery_target_time = '2026-05-20 14:30:00'
recovery_target_action = 'promote'
MongoDB Backup Strategies
MongoDB offers mongodump for logical backups and filesystem snapshots for physical backups. MongoDB Atlas provides automated continuous backups with point-in-time recovery built in.
# Full database backup
mongodump \
--uri="mongodb://backup_user:password@localhost:27017" \
--authenticationDatabase=admin \
--out=/backups/mongo/dump-$(date +%Y%m%d-%H%M%S) \
--gzip
# Backup specific database
mongodump \
--uri="mongodb://localhost:27017/myapp" \
--out=/backups/mongo/ \
--gzip \
--archive=/backups/mongo/myapp-$(date +%Y%m%d).archive
# Restore from backup
mongorestore \
--uri="mongodb://localhost:27017" \
--authenticationDatabase=admin \
--gzip \
--archive=/backups/mongo/myapp-20260520.archive \
--nsFrom="myapp.*" \
--nsTo="myapp_restored.*"
# Verify restore
mongo myapp_restored --eval "db.stats()"
mongo myapp_restored --eval "db.users.count()"
MongoDB Oplog for Point-in-Time Recovery
In a replica set, the oplog (operations log) records every write operation and enables PITR. Back it up alongside your data:
# Backup with oplog (enables point-in-time restore)
mongodump \
--uri="mongodb://localhost:27017" \
--oplog \
--gzip \
--archive=/backups/mongo/full-with-oplog-$(date +%Y%m%d).archive
# Replay oplog to a specific point in time
mongorestore \
--oplogReplay \
--oplogLimit "1748000000:1" \
--gzip \
--archive=/backups/mongo/full-with-oplog-20260520.archive
Automating Backups with Shell Scripts
Manual backups are backups that don't exist. Automate your backup process with a shell script and cron to ensure consistency.
#!/bin/bash
set -euo pipefail
# Configuration
BACKUP_DIR="/backups/mysql"
S3_BUCKET="s3://my-db-backups/mysql"
DB_USER="backup_user"
DB_PASS_FILE="/etc/mysql/backup-password"
RETENTION_DAYS=30
DATE=$(date +%Y%m%d-%H%M%S)
LOG_FILE="/var/log/db-backup.log"
log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"; }
log "Starting backup..."
# Create backup directory if missing
mkdir -p "$BACKUP_DIR"
# Perform mysqldump
BACKUP_FILE="$BACKUP_DIR/full-$DATE.sql.gz"
DB_PASS=$(cat "$DB_PASS_FILE")
mysqldump \
--user="$DB_USER" \
--password="$DB_PASS" \
--all-databases \
--single-transaction \
--routines \
--triggers \
--flush-logs \
| gzip > "$BACKUP_FILE"
log "Backup created: $BACKUP_FILE ($(du -sh "$BACKUP_FILE" | cut -f1))"
# Verify backup integrity
if gunzip -t "$BACKUP_FILE" 2>/dev/null; then
log "Integrity check PASSED"
else
log "ERROR: Integrity check FAILED — aborting upload"
exit 1
fi
# Upload to S3
aws s3 cp "$BACKUP_FILE" "$S3_BUCKET/$(basename "$BACKUP_FILE")" \
--storage-class STANDARD_IA
log "Uploaded to S3: $S3_BUCKET"
# Remove local backups older than retention period
find "$BACKUP_DIR" -name "*.sql.gz" -mtime +"$RETENTION_DAYS" -delete
log "Cleaned up backups older than $RETENTION_DAYS days"
log "Backup complete!"
Make the Script Executable
Set the correct permissions and create a dedicated backup user with minimal privileges.
# Make script executable
chmod +x /usr/local/bin/db-backup.sh
# Create MySQL backup user with minimal privileges
mysql -u root -p < /etc/mysql/backup-password
chmod 600 /etc/mysql/backup-password
Schedule with Cron
Run the script automatically at your desired intervals using crontab.
# Full backup every Sunday at 2:00 AM
0 2 * * 0 /usr/local/bin/db-backup.sh >> /var/log/db-backup.log 2>&1
# Differential backup every day Mon-Sat at 2:00 AM
0 2 * * 1-6 /usr/local/bin/db-differential-backup.sh >> /var/log/db-backup.log 2>&1
# Binary log flush every hour (for PITR)
0 * * * * mysql -u backup_user -p$(cat /etc/mysql/backup-password) -e "FLUSH BINARY LOGS;" >> /var/log/db-backup.log 2>&1
# Alert if no backup file was created in the last 26 hours
0 4 * * * find /backups/mysql -name "*.sql.gz" -mmin -1560 | grep -q . || echo "BACKUP MISSING!" | mail -s "DB Backup Alert" admin@yourcompany.com
Cloud Storage and Retention Policies
Storing backups on the same server as your database defeats the purpose — one hardware failure or ransomware attack wipes both. Cloud object storage (AWS S3, Google Cloud Storage, Backblaze B2) provides cheap, durable, geographically redundant storage.
| Storage Class | Use Case | Retrieval Time | Approx. Cost/GB/mo |
|---|---|---|---|
| S3 Standard | Recent backups (last 7 days) | Milliseconds | $0.023 |
| S3 Standard-IA | Monthly backups (8–30 days) | Milliseconds | $0.0125 |
| S3 Glacier Instant | Quarterly backups (31–90 days) | Milliseconds | $0.004 |
| S3 Glacier Deep Archive | Long-term archive (90+ days) | Up to 12 hours | $0.00099 |
{
"Rules": [
{
"ID": "db-backup-lifecycle",
"Status": "Enabled",
"Filter": { "Prefix": "mysql/" },
"Transitions": [
{
"Days": 7,
"StorageClass": "STANDARD_IA"
},
{
"Days": 30,
"StorageClass": "GLACIER_INSTANT_RETRIEVAL"
},
{
"Days": 90,
"StorageClass": "DEEP_ARCHIVE"
}
],
"Expiration": {
"Days": 365
}
}
]
}
Backups stored in the cloud should always be encrypted. Use AWS S3 server-side encryption (SSE-S3 or SSE-KMS) or encrypt before uploading with openssl enc -aes-256-cbc. A stolen backup file without encryption is a full data breach.
Testing Your Backups (The Most Skipped Step)
An untested backup is not a backup — it's just hope. A backup file that appears complete may contain corruption, an incomplete dump, or be from the wrong schema version. The only way to know your backups work is to restore from them regularly.
Automated Restore Testing Script
#!/bin/bash
# Weekly automated restore test — run in isolated environment
set -euo pipefail
BACKUP_DIR="/backups/mysql"
TEST_DB="restore_test_$(date +%Y%m%d)"
LATEST_BACKUP=$(ls -t "$BACKUP_DIR"/*.sql.gz | head -1)
ALERT_EMAIL="ops@yourcompany.com"
log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1"; }
log "Testing restore from: $LATEST_BACKUP"
# Create isolated test database
mysql -u root -p"$ROOT_PASS" -e "CREATE DATABASE $TEST_DB;"
# Attempt restore
if gunzip -c "$LATEST_BACKUP" | mysql -u root -p"$ROOT_PASS" "$TEST_DB"; then
log "Restore succeeded"
# Validate row counts match production
PROD_COUNT=$(mysql -u root -p"$ROOT_PASS" myapp_production \
-se "SELECT COUNT(*) FROM users;")
TEST_COUNT=$(mysql -u root -p"$ROOT_PASS" "$TEST_DB" \
-se "SELECT COUNT(*) FROM users;")
if [ "$PROD_COUNT" == "$TEST_COUNT" ]; then
log "Validation PASSED: user count matches ($PROD_COUNT rows)"
else
log "WARN: Row count mismatch — prod=$PROD_COUNT test=$TEST_COUNT"
echo "Row count mismatch in restore test" | mail -s "Backup Warning" "$ALERT_EMAIL"
fi
else
log "ERROR: Restore FAILED"
echo "Restore test failed for $LATEST_BACKUP" | mail -s "CRITICAL: Backup Restore Failed" "$ALERT_EMAIL"
fi
# Cleanup
mysql -u root -p"$ROOT_PASS" -e "DROP DATABASE $TEST_DB;"
log "Test database cleaned up"
Run this test in a separate environment — never restore into production for testing. A monthly scheduled job on a test server that restores the latest backup and validates row counts gives you confidence your backups are usable.
RTO, RPO, and Disaster Recovery Planning
Before designing your backup strategy, define your recovery targets. These two metrics drive every architectural decision:
- Recovery Time Objective (RTO): The maximum acceptable downtime. "We can be down for at most 4 hours." A lower RTO requires more infrastructure (hot standbys, pre-staged restore environments).
- Recovery Point Objective (RPO): The maximum acceptable data loss measured in time. "We can lose at most 1 hour of data." A lower RPO requires more frequent backups and transaction log shipping.
| Business Tier | Target RTO | Target RPO | Strategy |
|---|---|---|---|
| Mission Critical | < 15 min | < 1 min | Hot standby replica, synchronous replication |
| Business Critical | < 1 hour | < 1 hour | Warm standby, hourly transaction log backups |
| Standard | < 4 hours | < 4 hours | Daily full + automated restore scripts |
| Development | < 24 hours | < 24 hours | Daily full backup to S3 |
Documenting Your Runbook
A backup without a documented restore procedure is a liability. Your runbook should answer:
- Where are backup files stored, and how do you access them (credentials, VPN)?
- What is the exact sequence of commands to restore each database system?
- How do you validate the restore was successful?
- Who gets paged and in what order? What is the escalation path?
- When is the application safe to cut traffic back over?
Your runbook must be accessible when your primary systems are down. Store it in a static location (printed copy, separate wiki, Google Doc) — not solely on the same server that just failed. Run a "chaos drill" annually: simulate a failure and time how long it takes to execute your runbook from scratch.
Backup Monitoring and Alerting
Backups fail silently. Without monitoring, you may not discover a broken backup job until you desperately need to restore — the worst possible time. Instrument your backup pipeline with alerts.
What to Monitor
- Backup job completion: Alert if a scheduled backup didn't complete within the expected window
- Backup file size anomaly: A backup that's 80% smaller than yesterday's may indicate truncation or a missed schema
- S3 upload success: Verify the remote copy exists — local write succeeding doesn't mean the upload did
- Disk space: Alert when backup destination reaches 80% capacity before it fills and stops writing
- Restore test results: Weekly automated restore tests should report pass/fail to your monitoring system
- Backup age: Alert if the newest backup is older than your RPO — someone may have disabled the cron job
"You don't have backups. You have restores. A backup is only a backup if you can restore from it."
— Common Ops Wisdom
Database backup strategy is not a one-time task — it evolves as your data grows, your RPO/RTO requirements tighten, and new failure modes emerge. Start with a simple weekly full backup to S3, then layer in daily differentials and hourly transaction log shipping as your business demands it. Most importantly: test your restores, document your runbook, and treat the first successful restore drill as the real completion of your backup strategy.