1️⃣ What is Object Storage?

Object storage is a storage system designed for large, unstructured files (BLOBs), where each file is stored as a whole object under a unique key.

BLOB = Binary Large Object

Examples:

  • Photos 📷
  • Videos 🎥
  • Music files 🎵
  • PDFs
  • JSON files
  • Large logs
  • ML datasets

These files are basically large collections of bytes.


2️⃣ Why NOT Store Large Files in Traditional Databases?

Traditional relational databases like:

  • PostgreSQL
  • MySQL

are optimized for:

  • Small records
  • Frequent updates
  • Rich queries
  • Transactions
  • Joins

NOT for huge static files.


❌ Problems of Storing Files in DB

Example

Suppose:

  • User profile image = 4 MB
  • PostgreSQL page size = 8 KB

Then:

4 MB image = 4096 KB / 8 KB = 512 DB pages
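The page count is just a division. A quick sketch of the arithmetic, assuming binary units (1 MB = 1024 KB) and ignoring page headers and other per-page overhead, so real usage would be slightly higher:

```python
# Rough page-count arithmetic for a blob stored in fixed-size DB pages.
# Assumption: binary units, no page headers or row overhead.
import math

PAGE_SIZE_BYTES = 8 * 1024          # 8 KB page, as in the example
IMAGE_SIZE_BYTES = 4 * 1024 * 1024  # 4 MB profile image


def pages_needed(blob_bytes: int, page_bytes: int = PAGE_SIZE_BYTES) -> int:
    """Return the minimum number of pages required to hold blob_bytes."""
    return math.ceil(blob_bytes / page_bytes)


print(pages_needed(IMAGE_SIZE_BYTES))  # 512
```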

Problems Created

1. Slow Queries

Even a simple query:

SELECT * FROM users LIMIT 50;

becomes slower if the image is a column of users, because every row drags its blob along: 50 rows × 4 MB is roughly 200 MB to read and send.

Effects:

  • Increased memory pressure
  • Higher disk I/O
  • Slower reads

2. Replication Cost

DB replicas must copy huge blobs.

Effects:

  • High bandwidth usage
  • Replication lag
  • Expensive scaling

3. Backup & Restore Problems

Backups become huge because they include images/videos.

Effects:

  • Slow backup
  • Very slow disaster recovery

3️⃣ Core Idea

✅ Store:

Data Type      Storage
Metadata       Traditional DB
Large Files    Object Storage

4️⃣ What is Stored in Metadata DB?

Metadata is the small, queryable information about a file.

Example:

Field           Stored In
Post ID         DB
User ID         DB
Caption/Text    DB
Image URL       DB
Actual Image    Object Storage
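A minimal sketch of such a metadata row. The field names and the storage URL are hypothetical, chosen only to mirror the table; the point is that the DB holds a pointer to the image, never the image bytes.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class PostMetadata:
    """Small, queryable row kept in the metadata DB (field names are hypothetical)."""
    post_id: int
    user_id: int
    caption: str
    image_url: str  # pointer to the object; the image bytes are NOT stored here


post = PostMetadata(
    post_id=1,
    user_id=123,
    caption="Sunset at the beach",
    image_url="https://storage.example.com/user123/posts/1.jpg",  # made-up URL
)
row = asdict(post)  # what would be written to the DB
print(row)
```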

5️⃣ High Level Architecture

Client
   |
   v
Metadata Service
   |
   v
Object Storage Nodes

Read Flow

Step 1

Client asks metadata service:

"Where is file1?"

Step 2

Metadata service checks index.

file1 -> Server A

Step 3

Client downloads/streams the file directly from the storage node.
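The three steps can be sketched with in-memory stand-ins. The class and function names here (StorageNode, MetadataService, read_file) are illustrative assumptions, not the API of any real system:

```python
class StorageNode:
    """In-memory stand-in for one object storage server."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


class MetadataService:
    """Keeps the index that maps each file key to the node holding it."""

    def __init__(self) -> None:
        self._index = {}

    def register(self, key: str, node: StorageNode) -> None:
        self._index[key] = node

    def locate(self, key: str) -> StorageNode:
        return self._index[key]


def read_file(metadata: MetadataService, key: str) -> bytes:
    node = metadata.locate(key)  # Steps 1-2: ask where the file is, check the index
    return node.get(key)         # Step 3: fetch the bytes directly from that node


server_a = StorageNode("Server A")
server_a.put("file1", b"hello object storage")
metadata = MetadataService()
metadata.register("file1", server_a)
print(metadata.locate("file1").name, read_file(metadata, "file1"))
```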


6️⃣ Why is Object Storage Cheap & Durable?

A. Flat Namespace

Unlike traditional file systems:

/folder1/folder2/file.jpg

Object stores internally use:

a single unique key per object

Example:

user123/profile/image1.jpg

This is just a string key.

Benefits:

  • Faster lookup
  • Simpler architecture
  • Easy scaling
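A rough sketch of the flat-namespace idea, using a plain dictionary as the store. The "folders" are only a naming convention: listing one is a prefix filter over string keys. The keys and the list_keys helper are made up for illustration.

```python
# A flat namespace is just a mapping from string keys to bytes.
objects = {
    "user123/profile/image1.jpg": b"<jpeg bytes>",
    "user123/profile/image2.jpg": b"<jpeg bytes>",
    "user456/profile/image1.jpg": b"<jpeg bytes>",
}


def list_keys(store: dict, prefix: str) -> list:
    """Emulate 'listing a folder' by filtering keys on a string prefix."""
    return sorted(key for key in store if key.startswith(prefix))


# Fetching an object is a single lookup on the full key, with no directory walk.
image = objects["user123/profile/image1.jpg"]
print(list_keys(objects, "user123/"))
```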

B. Immutable Writes

Files are usually immutable.

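You:

  • Cannot modify bytes in the middle of an object
  • Usually overwrite the entire object
  • Or create a new version

Benefits:

  • No locks for in-place updates
  • No partial-update race conditions (readers see the old object or the new one, never a mix)
  • Simpler distributed systems
  • Better scalability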
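A toy sketch of whole-object, versioned writes. VersionedStore is a hypothetical in-memory class, not a real storage API; it only shows that a write never touches existing bytes and instead appends a new version.

```python
from typing import Dict, List, Optional


class VersionedStore:
    """Toy store where every write creates a new immutable version."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[bytes]] = {}

    def put(self, key: str, data: bytes) -> int:
        versions = self._versions.setdefault(key, [])
        versions.append(bytes(data))  # whole-object write; old versions are untouched
        return len(versions)          # 1-based version number

    def get(self, key: str, version: Optional[int] = None) -> bytes:
        versions = self._versions[key]
        return versions[-1] if version is None else versions[version - 1]


store = VersionedStore()
v1 = store.put("user123/profile/image1.jpg", b"old image")
v2 = store.put("user123/profile/image1.jpg", b"new image")
print(v1, v2, store.get("user123/profile/image1.jpg"))
```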

C. Redundancy & Replication

Each object is stored across:

  • Multiple servers
  • Multiple racks
  • Sometimes multiple data centers

Techniques:

  • Replication
  • Erasure Coding
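To make erasure coding less abstract, here is a deliberately tiny sketch: two data shards plus one XOR parity shard, which can survive the loss of any single shard. Production systems use stronger codes (such as Reed-Solomon) with more shards; the function names and the 2+1 layout are assumptions for illustration only.

```python
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes) -> List[bytes]:
    """Split even-length data into two shards and add one XOR parity shard."""
    if len(data) % 2:
        raise ValueError("this toy example expects an even number of bytes")
    half = len(data) // 2
    first, second = data[:half], data[half:]
    return [first, second, xor_bytes(first, second)]


def recover(shards: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild exactly one missing shard (marked None) from the other two."""
    missing = shards.index(None)
    survivors = [s for s in shards if s is not None]
    repaired = list(shards)
    repaired[missing] = xor_bytes(survivors[0], survivors[1])
    return repaired


original = b"object-bytes"   # 12 bytes
shards = encode(original)    # imagine each shard on a different server
shards[1] = None             # one server is lost
first, second, _parity = recover(shards)
print(first + second == original)
```

This 2+1 layout stores 1.5× the original size, compared with 3× for keeping three full replicas, which is the usual reason erasure coding is cheaper for the same fault tolerance.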

Durability

Typical durability:

99.999999999%

Called:

11 nines durability

Meaning:

  • Losing a few servers won’t lose data, because other copies or parity shards survive.
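A back-of-the-envelope sketch of what 11 nines implies, assuming the figure is read as a per-object, per-year survival probability (the usual interpretation of such design targets). Exact fractions are used to avoid floating-point noise at this scale.

```python
from fractions import Fraction

# 1 - 0.99999999999, kept exact. Assumption: durability is per object, per year.
ANNUAL_LOSS_PROBABILITY = Fraction(1, 10**11)


def expected_losses_per_year(object_count: int) -> Fraction:
    """Expected number of objects lost in one year."""
    return object_count * ANNUAL_LOSS_PROBABILITY


losses = expected_losses_per_year(10_000_000)
print(losses, "objects/year, i.e. one object every", 1 / losses, "years")
```

So with 10 million stored objects, the expected loss is about one object every 10,000 years.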

7️⃣ Pre-Signed URLs (VERY IMPORTANT)

Frequently asked in interviews ⚠️


❌ Bad Approach

Client -> Backend -> S3

Problems:

  • Backend bandwidth waste (every byte crosses the backend twice)
  • Backend bottleneck
  • Large file handling issues (timeouts, memory buffering)

✅ Correct Industry Approach

Client -> Backend -> Get Pre-Signed URL
Client -> Direct Upload to S3

Upload Flow

Step 1

Client asks backend:

"I want to upload image.png"

Step 2

Backend generates a pre-signed URL for the S3 object, signed with the backend's own credentials.

URL contains:

  • Temporary permission
  • Expiry time
  • Allowed operation

Step 3

Client uploads directly to S3.

Benefits:

  • Less backend load
  • Better scalability
  • Faster uploads
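A simplified sketch of the idea behind a pre-signed URL: the backend signs the allowed operation, the object key, and an expiry time, and the storage service checks that signature later. This is NOT the real S3 scheme (S3 uses AWS Signature Version 4, normally produced by an SDK, for example boto3's generate_presigned_url). The secret, host name, and function names below are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-secret"  # hypothetical; shared by backend and storage service


def _signature(method: str, key: str, expires: int) -> str:
    message = f"{method}\n{key}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def presign(method: str, key: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Backend side: build a URL granting one operation on one key until expiry."""
    now = time.time() if now is None else now
    expires = int(now) + ttl_seconds
    query = urlencode({"expires": expires, "signature": _signature(method, key, expires)})
    return f"https://storage.example.com/{key}?{query}"  # key assumed URL-safe


def verify(method: str, url: str, now: Optional[float] = None) -> bool:
    """Storage side: accept only an unexpired URL signed for this exact operation."""
    now = time.time() if now is None else now
    parsed = urlparse(url)
    key = parsed.path.lstrip("/")
    params = parse_qs(parsed.query)
    expires = int(params["expires"][0])
    expected = _signature(method, key, expires)
    return now <= expires and hmac.compare_digest(expected, params["signature"][0])


url = presign("PUT", "user123/image.png", ttl_seconds=300, now=1_000)
print(verify("PUT", url, now=1_100), verify("GET", url, now=1_100), verify("PUT", url, now=2_000))
```

The three checks correspond to the three things the URL carries: temporary permission (valid signature), allowed operation (PUT only), and expiry time.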

Download Flow

Same idea:

Client -> Directly download from S3

instead of:

Client -> Backend -> File

8️⃣ Multipart Upload

Used for huge files.

Also called:

Multi-part upload

Why Needed?

HTTP upload size limits exist:

  • Browser limits
  • Gateway limits
  • Server limits

Large file example:

  • 10 GB video

Cannot be uploaded reliably in a single request (S3, for example, caps a single PUT at 5 GB).


Solution

Split file into chunks.

Example:

10 GB
 -> 5 MB chunks
 -> Upload chunks separately

Then object storage stitches the chunks together, in part order, once the client signals that the upload is complete.
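A minimal sketch of splitting and stitching, using in-memory bytes instead of a real upload. The helper names are made up, and binary units are assumed for the 10 GB / 5 MB arithmetic.

```python
import math
from typing import Dict


def part_count(total_bytes: int, part_bytes: int) -> int:
    """Number of chunks needed to cover total_bytes."""
    return math.ceil(total_bytes / part_bytes)


def split_into_parts(data: bytes, part_bytes: int) -> Dict[int, bytes]:
    """Return {part_number: chunk}, with part numbers starting at 1."""
    return {
        index + 1: data[offset:offset + part_bytes]
        for index, offset in enumerate(range(0, len(data), part_bytes))
    }


def stitch(parts: Dict[int, bytes]) -> bytes:
    """Reassemble the object by concatenating chunks in part-number order."""
    return b"".join(parts[number] for number in sorted(parts))


print(part_count(10 * 1024**3, 5 * 1024**2))  # 2048 parts for 10 GB in 5 MB chunks

payload = bytes(range(256)) * 10         # 2560-byte stand-in for a large file
parts = split_into_parts(payload, 1000)  # 3 parts: 1000 + 1000 + 560 bytes
print(len(parts), stitch(parts) == payload)
```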


Benefits

  • Resume failed uploads
  • Parallel uploads
  • Faster uploads
  • Reliable large file transfer
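Resume and parallelism both follow from parts being independent. A rough sketch, where a dictionary stands in for the storage service and upload_part stands in for one HTTP request (both are assumptions, not a real client API):

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def upload_part(store: Dict[int, bytes], number: int, chunk: bytes) -> int:
    store[number] = chunk  # stand-in for one HTTP request per part
    return number


def upload_missing_parts(
    store: Dict[int, bytes], parts: Dict[int, bytes], workers: int = 4
) -> List[int]:
    """Upload, in parallel, only the parts the store does not have yet."""
    missing = [(n, c) for n, c in parts.items() if n not in store]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        uploaded = list(pool.map(lambda item: upload_part(store, *item), missing))
    return sorted(uploaded)


parts = {1: b"aaaa", 2: b"bbbb", 3: b"cc"}
store = {1: b"aaaa"}  # part 1 survived an earlier, interrupted attempt
resumed = upload_missing_parts(store, parts)
print(resumed, store == parts)
```

Only parts 2 and 3 are sent on the retry; a failed single-request upload would have had to start again from byte zero.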

9️⃣ Common Use Cases

Use Case                 Why Object Storage
Social media photos      Large static files
Video streaming          Huge media files
Dropbox/Drive            File storage
ML datasets              Massive training data
Logs                     Cheap long-term storage
Static website assets    CSS/JS/images

🔟 Popular Object Storage Systems

Provider    Service
AWS         Amazon S3
GCP         Google Cloud Storage
Azure       Azure Blob Storage

1️⃣1️⃣ Interview Notes (IMPORTANT)

Always Mention

✅ Store metadata separately

Metadata -> SQL/NoSQL DB
Files -> Object Storage

✅ Use Pre-Signed URLs

Avoid routing files through backend.


✅ Use Multipart Upload

For large files.


✅ Mention CDN

Object storage is usually fronted by a CDN.

Client -> CDN -> Object Storage

Benefits:

  • Lower latency
  • Global caching
  • Reduced origin load
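The "reduced origin load" benefit can be sketched as a read-through cache. CdnEdge is a toy class for illustration; real CDNs also handle TTLs, invalidation, and many edge locations, all of which are left out here.

```python
from typing import Dict


class CdnEdge:
    """Toy read-through cache in front of an origin object store."""

    def __init__(self, origin: Dict[str, bytes]) -> None:
        self._origin = origin
        self._cache: Dict[str, bytes] = {}
        self.origin_requests = 0

    def get(self, key: str) -> bytes:
        if key not in self._cache:  # cache miss: go to the origin once
            self.origin_requests += 1
            self._cache[key] = self._origin[key]
        return self._cache[key]     # cache hit: served from the edge


origin = {"static/app.js": b"console.log('hi')"}
edge = CdnEdge(origin)
for _ in range(3):
    edge.get("static/app.js")
print(edge.origin_requests)  # 1
```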

1️⃣2️⃣ Quick Interview Summary

When to Use Object Storage?

Use when:

  • Files are large
  • Files are mostly static
  • Storage must be cheap
  • Durability must be high

Examples:

  • Images
  • Videos
  • Documents
  • Logs
  • ML data

1️⃣3️⃣ One-Line Interview Answer

“Large binary files should be stored in object storage like S3, while metadata remains in traditional databases. Clients typically upload/download directly using pre-signed URLs, and large files use multipart uploads.”