1️⃣ What is Object Storage?
Object storage is a storage system designed for storing large files (BLOBs).
BLOB = Binary Large Object
Examples:
- Photos 📷
- Videos 🎥
- Music files 🎵
- PDFs
- JSON files
- Large logs
- ML datasets
These files are basically large collections of bytes.
2️⃣ Why NOT Store Large Files in Traditional Databases?
Traditional relational databases like:
- PostgreSQL
- MySQL
are optimized for:
- Small records
- Frequent updates
- Rich queries
- Transactions
- Joins
NOT for huge static files.
❌ Problems of Storing Files in DB
Example
Suppose:
- User profile image = 4 MB
- PostgreSQL page size = 8 KB
Then:
4 MB image ≈ 500 DB pages
Problems Created
1. Slow Queries
Even a simple query:
SELECT * FROM users LIMIT 50;
becomes slower because the DB has to read and return the huge blobs along with every row.
Effects:
- Increased memory pressure
- Higher disk I/O
- Slower reads
2. Replication Cost
DB replicas must copy huge blobs.
Effects:
- High bandwidth usage
- Replication lag
- Expensive scaling
3. Backup & Restore Problems
Backups become huge because they include images/videos.
Effects:
- Slow backup
- Very slow disaster recovery
3️⃣ Core Idea
✅ Store:
| Data Type | Storage |
|---|---|
| Metadata | Traditional DB |
| Large Files | Object Storage |
4️⃣ What is Stored in Metadata DB?
Metadata means small queryable information.
Example:
| Field | Stored In |
|---|---|
| Post ID | DB |
| User ID | DB |
| Caption/Text | DB |
| Image URL | DB |
| Actual Image | Object Storage |
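A minimal sketch of the split above, using an in-memory SQLite table as the metadata DB. The table name, column names and the URL are hypothetical, chosen only to mirror the fields in the table; the image bytes themselves never enter the database.

```python
import sqlite3

# Hypothetical schema: names and the URL below are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE posts (
        post_id   INTEGER PRIMARY KEY,
        user_id   INTEGER NOT NULL,
        caption   TEXT,
        image_url TEXT NOT NULL  -- pointer to the object, not the bytes
    )
    """
)
conn.execute(
    "INSERT INTO posts (post_id, user_id, caption, image_url) VALUES (?, ?, ?, ?)",
    (1, 123, "Sunset", "https://example-bucket.example.com/user123/posts/1.jpg"),
)


def get_post(post_id):
    """Return the small metadata row; the image is fetched separately by URL."""
    return conn.execute(
        "SELECT post_id, user_id, caption, image_url FROM posts WHERE post_id = ?",
        (post_id,),
    ).fetchone()
```

Rows stay a few hundred bytes each, so queries, replication and backups only move metadata.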
5️⃣ High Level Architecture
Client
|
v
Metadata Service
|
v
Object Storage Nodes
Read Flow
Step 1
Client asks metadata service:
"Where is file1?"
Step 2
Metadata service checks its index:
file1 -> Server A
Step 3
Client directly downloads/streams file from storage node.
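The three steps can be sketched with plain dictionaries standing in for the metadata index and the storage nodes. All names here (`server-a`, `locate`, `download`) are assumptions made for illustration, not part of any real system.

```python
# Hypothetical in-memory stand-ins for the metadata service and storage nodes.
STORAGE_NODES = {
    "server-a": {"file1": b"hello object storage"},
    "server-b": {"file2": b"another object"},
}
METADATA_INDEX = {"file1": "server-a", "file2": "server-b"}


def locate(file_id):
    """Steps 1-2: the metadata service looks up which node holds the file."""
    return METADATA_INDEX[file_id]


def download(node, file_id):
    """Step 3: the client reads the bytes straight from the storage node."""
    return STORAGE_NODES[node][file_id]


def read_file(file_id):
    """Full read flow: ask where the file is, then fetch it from that node."""
    node = locate(file_id)
    return download(node, file_id)
```

The metadata service only answers "where"; the file bytes never pass through it.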
6️⃣ Why Object Storage is Cheap & Durable?
A. Flat Namespace
Unlike traditional file systems:
/folder1/folder2/file.jpg
Object stores internally use:
single unique key
Example:
user123/profile/image1.jpg
This is just a string key; the slashes are part of the name, not real folders.
Benefits:
- Faster lookup
- Simpler architecture
- Easy scaling
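A toy sketch of a flat namespace: one dictionary keyed by the full string, with "folders" emulated by prefix matching. The class and method names are hypothetical.

```python
class FlatObjectStore:
    """Toy key-value store: keys are plain strings, '/' has no special meaning."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        # A single dictionary lookup; no directory tree to walk.
        return self._objects[key]

    def list_prefix(self, prefix):
        """Emulate 'listing a folder' by filtering keys on a string prefix."""
        return sorted(k for k in self._objects if k.startswith(prefix))


store = FlatObjectStore()
store.put("user123/profile/image1.jpg", b"...")
store.put("user123/profile/image2.jpg", b"...")
store.put("user456/profile/image1.jpg", b"...")
```

Usage: `store.list_prefix("user123/")` returns only the two keys that start with that prefix.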
B. Immutable Writes
Files are usually immutable.
You:
- Cannot modify bytes in the middle of a file
- Usually overwrite entire file
- Or create new version
Benefits:
- No locks
- Fewer race conditions (no partial in-place updates to coordinate)
- Simpler distributed systems
- Better scalability
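A small sketch of "overwrite or create a new version": every write appends a complete new copy and existing bytes are never edited. `VersionedStore` is a hypothetical name, and real object stores implement versioning differently.

```python
class VersionedStore:
    """Toy store where each put creates a new immutable version."""

    def __init__(self):
        self._versions = {}  # key -> list of immutable byte strings

    def put(self, key, data):
        versions = self._versions.setdefault(key, [])
        versions.append(bytes(data))  # store a whole new copy, never patch
        return len(versions)  # version number, starting at 1

    def get(self, key, version=None):
        versions = self._versions[key]
        if version is None:
            return versions[-1]  # latest version by default
        return versions[version - 1]
```

Readers of version 1 are unaffected when version 2 is written, which is why no locking is needed for reads.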
C. Redundancy & Replication
Each object is stored across:
- Multiple servers
- Multiple racks
- Sometimes multiple data centers
Techniques:
- Replication
- Erasure Coding
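To make erasure coding concrete, here is a deliberately tiny sketch: two data shards plus one XOR parity shard, so any single lost shard can be rebuilt. Production systems use stronger codes with more shards (Reed-Solomon is a common choice); this scheme and the function names are assumptions for illustration only.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data):
    """Split data into two equal halves plus one XOR parity shard."""
    if len(data) % 2:
        data += b"\x00"  # pad to even length; a real system records the true size
    half = len(data) // 2
    d1, d2 = data[:half], data[half:]
    return [d1, d2, xor_bytes(d1, d2)]


def recover(shards):
    """Rebuild the single missing shard (marked None) from the other two."""
    missing = shards.index(None)
    others = [s for s in shards if s is not None]
    result = list(shards)
    result[missing] = xor_bytes(others[0], others[1])
    return result
```

The three shards together take 1.5x the original size, versus 3x for keeping three full replicas, while still surviving the loss of any one shard.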
Durability
Typical durability:
99.999999999%
Called:
11 nines durability
Meaning:
- Losing a few servers won’t lose data.
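A quick worked calculation of what 11 nines implies, assuming the figure is read as the chance that a single object survives one year (the way S3 states it). The 10 million object count is just an example input.

```python
from fractions import Fraction

# Assumption: durability is per object, per year.
durability = Fraction("0.99999999999")
annual_loss_probability = 1 - durability  # exactly 1 in 10**11


def expected_losses_per_year(object_count):
    """Average number of objects lost per year at this durability."""
    return annual_loss_probability * object_count


def years_per_lost_object(object_count):
    """Average number of years between losing a single object."""
    return 1 / expected_losses_per_year(object_count)
```

With 10,000,000 stored objects, `years_per_lost_object(10_000_000)` works out to 10,000 years per lost object on average.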
7️⃣ Pre-Signed URLs (VERY IMPORTANT)
Frequently asked in interviews ⚠️
❌ Bad Approach
Client -> Backend -> S3
Problems:
- Backend bandwidth waste
- Backend bottleneck
- Large file handling issues
✅ Correct Industry Approach
Client -> Backend -> Get PreSigned URL
Client -> Direct Upload to S3
Upload Flow
Step 1
Client asks backend:
"I want to upload image.png"
Step 2
Backend generates a pre-signed URL for S3, signed with its own credentials.
URL contains:
- Temporary permission
- Expiry time
- Allowed operation
Step 3
Client uploads directly to S3.
Benefits:
- Less backend load
- Better scalability
- Faster uploads
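The idea behind a pre-signed URL can be sketched with a plain HMAC: the backend signs the operation, key and expiry, and the storage side checks that signature without ever contacting the backend. This is a simplified illustration, not S3's actual signing scheme; the secret, host name and function names are all hypothetical.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-secret"  # hypothetical; known to backend and storage only


def _sign(method, key, expires):
    """Sign the allowed operation, the object key and the expiry time."""
    message = f"{method}\n{key}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_presigned_url(method, key, ttl_seconds, now=None):
    """Backend side: issue a URL valid for one operation until it expires."""
    now = int(time.time()) if now is None else now
    expires = now + ttl_seconds
    query = urlencode({"expires": expires, "signature": _sign(method, key, expires)})
    return f"https://storage.example.com/{key}?{query}"


def verify(method, url, now=None):
    """Storage side: accept only an unexpired URL signed for this operation."""
    now = int(time.time()) if now is None else now
    parsed = urlparse(url)
    key = parsed.path.lstrip("/")
    params = parse_qs(parsed.query)
    expires = int(params["expires"][0])
    if now > expires:
        return False
    return hmac.compare_digest(params["signature"][0], _sign(method, key, expires))
```

A URL issued for PUT fails verification for GET, and any URL fails after its expiry. In practice an SDK does the signing; boto3's S3 client, for instance, exposes `generate_presigned_url`.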
Download Flow
Same idea:
Client -> Directly download from S3
instead of:
Client -> Backend -> File
8️⃣ Multipart Upload
Used for huge files.
Also called:
Multi-part upload
Why Needed?
HTTP upload size limits exist:
- Browser limits
- Gateway limits
- Server limits
Large file example:
10 GB video
Cannot be uploaded in a single request (S3, for example, caps a single PUT at 5 GB).
Solution
Split file into chunks.
Example:
10 GB
-> 5 MB chunks
-> Upload chunks separately
Then object storage stitches the chunks together.
Benefits
- Resume failed uploads
- Parallel uploads
- Faster uploads
- Reliable large file transfer
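A minimal in-memory sketch of the split, parallel upload and stitch steps. `MultipartUpload` is a hypothetical stand-in for the storage side, and the tiny part sizes are only for demonstration; real services set their own minimum part size.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(data, part_size):
    """Return (part_number, chunk) pairs; part numbers start at 1."""
    return [
        (i // part_size + 1, data[i : i + part_size])
        for i in range(0, len(data), part_size)
    ]


class MultipartUpload:
    """Toy stand-in for the storage side of one multipart upload."""

    def __init__(self):
        self._parts = {}

    def upload_part(self, part_number, chunk):
        # Retrying a failed part simply overwrites that one part.
        self._parts[part_number] = chunk

    def complete(self):
        """Stitch the parts together in part-number order."""
        return b"".join(self._parts[n] for n in sorted(self._parts))


def upload(data, part_size, workers=4):
    """Upload all parts in parallel, then ask storage to assemble them."""
    session = MultipartUpload()
    parts = split_into_parts(data, part_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda p: session.upload_part(*p), parts))
    return session.complete()
```

Because each part carries its own number, parts can arrive in any order and a failed part can be retried alone, which is where the resume and parallelism benefits come from.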
9️⃣ Common Use Cases
| Use Case | Why Object Storage |
|---|---|
| Social media photos | Large static files |
| Video streaming | Huge media files |
| Dropbox/Drive | File storage |
| ML datasets | Massive training data |
| Logs | Cheap long-term storage |
| Static website assets | CSS/JS/images |
🔟 Popular Object Storage Systems
| Provider | Service |
|---|---|
| AWS | Amazon S3 |
| GCP | Google Cloud Storage |
| Azure | Azure Blob Storage |
1️⃣1️⃣ Interview Notes (IMPORTANT)
Always Mention
✅ Store metadata separately
Metadata -> SQL/NoSQL DB
Files -> Object Storage
✅ Use Pre-Signed URLs
Avoid routing files through backend.
✅ Use Multipart Upload
For large files.
✅ Mention CDN
Usually object storage is fronted by CDN.
Client -> CDN -> Object Storage
Benefits:
- Lower latency
- Global caching
- Reduced origin load
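A rough sketch of why a CDN reduces origin load: an edge node keeps a copy for a fixed TTL and only goes back to object storage on a miss. `Origin`, `CdnEdge` and the explicit `now` argument are assumptions for illustration; real CDNs decide freshness from cache headers and other policies.

```python
class Origin:
    """Stand-in for object storage; counts how often it is hit."""

    def __init__(self, objects):
        self.objects = objects
        self.requests = 0

    def fetch(self, key):
        self.requests += 1
        return self.objects[key]


class CdnEdge:
    """Toy edge cache with a fixed time-to-live per entry."""

    def __init__(self, origin, ttl_seconds):
        self.origin = origin
        self.ttl = ttl_seconds
        self._cache = {}  # key -> (expires_at, data)

    def get(self, key, now):
        entry = self._cache.get(key)
        if entry and entry[0] > now:
            return entry[1]  # cache hit: origin is not contacted
        data = self.origin.fetch(key)  # miss or expired: go to the origin
        self._cache[key] = (now + self.ttl, data)
        return data
```

Repeated requests inside the TTL are served from the edge, so the origin sees one request per object per TTL window instead of one per client.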
1️⃣2️⃣ Quick Interview Summary
When to Use Object Storage?
Use when:
- Files are large
- Mostly static
- Need cheap storage
- Need high durability
Examples:
- Images
- Videos
- Documents
- Logs
- ML data
1️⃣3️⃣ One-Line Interview Answer
“Large binary files should be stored in object storage like S3, while metadata remains in traditional databases. Clients typically upload/download directly using pre-signed URLs, and large files use multipart uploads.”