[AWS] Amazon S3

silver

Nov 18, 2025

Contents

S3 Basics
S3 Security
S3 Website Hosting
S3 Replication
S3 Storage Classes
S3 Performance Optimization
S3 Select & Glacier Select
Real-World Use Cases
Cost Optimization Strategies

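Amazon S3 (Simple Storage Service) is an object storage service that offers virtually unlimited scalability, 99.999999999% (11 nines) durability, and a wide range of use cases.

S3 Basics

Object Storage

S3 stores each file as an "object".


Traditional file system:
/home/user/documents/report.pdf

S3 object storage:
Bucket: my-documents
Key: user/documents/report.pdf
→ The namespace is actually flat (there are no real directories; the slashes are just part of the key)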
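Because the key space is flat, "folders" are only a listing convention built from a prefix and a delimiter. The following is a minimal local sketch of that idea; `list_level` is a hypothetical helper (not an S3 API call), and its two return values loosely mirror the `Contents` and `CommonPrefixes` fields of a delimiter-based listing.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Emulate one 'folder' level over a flat list of object keys.

    Returns (objects, common_prefixes) for the given prefix.
    """
    objects, common_prefixes = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter is reported as a "folder".
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


keys = [
    "user/documents/report.pdf",
    "user/documents/notes.txt",
    "user/photos/cat.jpg",
    "readme.txt",
]
print(list_level(keys))                  # top level of the bucket
print(list_level(keys, prefix="user/"))  # one level down
```

The keys themselves never change; only the way they are grouped for display does.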

Key Terms

Bucket

Top-level container that stores objects

Name must be globally unique

Created in a specific region

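Default quota per account (100 historically; AWS raised the default to 10,000 in late 2024), and increases can be requested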

Object

The data itself plus its metadata

Up to 5 TB in size

Identified by a key

Key

Unique identifier of an object within a bucket

Example: 2024/11/photos/vacation.jpg

Bucket Naming Rules

✅ Valid bucket names:


my-app-bucket-2024
user-photos-prod
company-data-backup

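❌ Invalid bucket names:


MyAppBucket          (uppercase letters not allowed)
my_app_bucket        (underscores not allowed)
192.168.1.1          (IP address format not allowed)
my..bucket           (consecutive dots not allowed)
bucket-              (must not start or end with a hyphen)

Rules:

3 to 63 characters long

Lowercase letters, numbers, hyphens, and dots only

Must begin and end with a letter or number

Must not be formatted as an IP address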
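The rules above can be checked locally before calling the API. This is a minimal sketch that covers only the rules listed in this section; `is_valid_bucket_name` is a hypothetical helper, and S3 enforces further restrictions (for example, reserved prefixes and suffixes) that it does not check.

```python
import ipaddress
import re

# 3-63 chars; lowercase letters, digits, hyphens, dots;
# must start and end with a letter or digit.
_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def is_valid_bucket_name(name: str) -> bool:
    """Check a candidate name against the rules listed in this section."""
    if not _NAME_RE.match(name):
        return False
    if ".." in name:                      # consecutive dots are not allowed
        return False
    try:
        ipaddress.IPv4Address(name)       # names shaped like 192.168.1.1
        return False
    except ValueError:
        return True


for candidate in ["my-app-bucket-2024", "MyAppBucket", "my_app_bucket",
                  "192.168.1.1", "my..bucket", "bucket-"]:
    print(candidate, is_valid_bucket_name(candidate))
```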

Creating an S3 Bucket


# Create a bucket with the AWS CLI
aws s3 mb s3://my-unique-bucket-20241119 --region ap-northeast-2

# List buckets
aws s3 ls

# List the contents of a specific bucket
aws s3 ls s3://my-unique-bucket-20241119/

Uploading S3 Objects


# Upload a file
aws s3 cp myfile.txt s3://my-unique-bucket-20241119/

# Upload a folder (recursively)
aws s3 cp ./photos s3://my-unique-bucket-20241119/photos/ --recursive

# Upload with metadata
aws s3 cp myfile.txt s3://my-unique-bucket-20241119/ \
  --metadata author=john,date=2024-11-19 \
  --content-type text/plain

Downloading S3 Objects


# Download a file
aws s3 cp s3://my-unique-bucket-20241119/myfile.txt ./

# Download a folder
aws s3 cp s3://my-unique-bucket-20241119/photos/ ./photos/ --recursive

# Sync (only new or changed files)
aws s3 sync s3://my-unique-bucket-20241119/photos/ ./photos/

Deleting S3 Objects


# Delete a single object
aws s3 rm s3://my-unique-bucket-20241119/myfile.txt

# Delete every object under a prefix
aws s3 rm s3://my-unique-bucket-20241119/photos/ --recursive

# Delete a bucket (it must be empty)
aws s3 rb s3://my-unique-bucket-20241119

# Force-delete a bucket together with its contents
aws s3 rb s3://my-unique-bucket-20241119 --force

S3 Security

1. Bucket Policy

Defines bucket-level access control in JSON

Allow public read:


{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-public-bucket/*"
    }
  ]
}

Allow specific IP ranges only:


{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-bucket/*",
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": [
            "203.0.113.0/24",
            "198.51.100.0/24"
          ]
        }
      }
    }
  ]
}

Allow a specific AWS account only:


{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:root"
      },
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::my-shared-bucket/*"
    }
  ]
}

Allow access only through a VPC endpoint:


{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::my-private-bucket",
        "arn:aws:s3:::my-private-bucket/*"
      ],
      "Condition": {
        "StringNotEquals": {
          "aws:SourceVpce": "vpce-1234567890abcdef0"
        }
      }
    }
  ]
}

2. IAM Policy

Access control at the user/role level


{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket"
      ],
      "Resource": "arn:aws:s3:::my-bucket"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::my-bucket/user/${aws:username}/*"
    }
  ]
}

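This policy restricts object reads, writes, and deletes to the user's own prefix (for example, user/john/). Note that s3:ListBucket, as written, still lets the user list every key in the bucket.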
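To see how the `${aws:username}` policy variable scopes access, the sketch below substitutes a username into the resource pattern and matches object ARNs against it. `allowed` is a hypothetical helper that only approximates wildcard matching on the `Resource` element; real IAM evaluation also considers actions, conditions, and explicit denies.

```python
from fnmatch import fnmatchcase

# Resource pattern taken from the policy above.
RESOURCE_TEMPLATE = "arn:aws:s3:::my-bucket/user/${aws:username}/*"


def allowed(username: str, key: str) -> bool:
    """Approximate whether `key` falls under the user's resource pattern."""
    pattern = RESOURCE_TEMPLATE.replace("${aws:username}", username)
    return fnmatchcase(f"arn:aws:s3:::my-bucket/{key}", pattern)


print(allowed("john", "user/john/report.pdf"))  # john's own prefix
print(allowed("john", "user/jane/report.pdf"))  # another user's prefix
```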

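3. ACL (Access Control List) - Legacy

Object-level access control; bucket policies are recommended instead


# Make an object public
# (fails on buckets with ACLs disabled, the default for new buckets since April 2023)
aws s3api put-object-acl \
  --bucket my-bucket \
  --key myfile.txt \
  --acl public-read

⚠️ Not recommended: use a bucket policy instead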

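4. Block Public Access

Block all public access by default (recommended):


# The configuration is passed as one argument; splitting it across
# indented continuation lines would break it into separate words
aws s3api put-public-access-block \
  --bucket my-bucket \
  --public-access-block-configuration \
    "BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true"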

5. Encryption

Server-Side Encryption (SSE)

SSE-S3 (default):


# Encrypt on upload (the high-level "aws s3 cp" command takes --sse)
aws s3 cp myfile.txt s3://my-bucket/ \
  --sse AES256

SSE-KMS (keys managed in AWS KMS):


aws s3 cp myfile.txt s3://my-bucket/ \
  --sse aws:kms \
  --sse-kms-key-id arn:aws:kms:region:account-id:key/key-id

SSE-C (customer-provided key):


aws s3api put-object \
  --bucket my-bucket \
  --key myfile.txt \
  --body myfile.txt \
  --sse-customer-algorithm AES256 \
  --sse-customer-key base64-encoded-key

Client-Side Encryption

Encrypt in the application, then upload:


from cryptography.fernet import Fernet
import boto3

# Generate a key (store it securely; without it the object cannot be decrypted)
key = Fernet.generate_key()
cipher = Fernet(key)

# Encrypt the file
with open('myfile.txt', 'rb') as f:
    data = f.read()
    encrypted = cipher.encrypt(data)

# Upload to S3
s3 = boto3.client('s3')
s3.put_object(Bucket='my-bucket', Key='encrypted-file', Body=encrypted)

6. Versioning

Preserves every version of an object


# Enable versioning
aws s3api put-bucket-versioning \
  --bucket my-bucket \
  --versioning-configuration Status=Enabled

# Upload the file several times
aws s3 cp myfile.txt s3://my-bucket/  # version 1
aws s3 cp myfile.txt s3://my-bucket/  # version 2
aws s3 cp myfile.txt s3://my-bucket/  # version 3

# List all versions
aws s3api list-object-versions --bucket my-bucket

# Download a specific version
aws s3api get-object \
  --bucket my-bucket \
  --key myfile.txt \
  --version-id <version-id> \
  myfile-v1.txt

Delete markers:


# Delete the object (this only adds a delete marker)
aws s3 rm s3://my-bucket/myfile.txt

# The object still exists, hidden behind the delete marker
# To delete it permanently, specify the version ID
aws s3api delete-object \
  --bucket my-bucket \
  --key myfile.txt \
  --version-id <version-id>
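The delete-marker behavior is easier to follow with a small model. The class below is a toy, in-memory sketch of a single key in a versioning-enabled bucket; `VersionedKey` and its version IDs (`v1`, `v2`, ...) are hypothetical and are not part of any S3 API.

```python
import itertools


class VersionedKey:
    """Toy model of one key in a versioning-enabled bucket (not the S3 API)."""

    def __init__(self):
        self._versions = []               # newest last: (version_id, body or None)
        self._ids = itertools.count(1)

    def put(self, body):
        vid = f"v{next(self._ids)}"
        self._versions.append((vid, body))
        return vid

    def delete(self, version_id=None):
        if version_id is None:
            # Simple delete: stack a delete marker on top, keep old versions.
            return self.put(None)
        # Versioned delete: permanently remove exactly that version.
        self._versions = [v for v in self._versions if v[0] != version_id]
        return version_id

    def get(self, version_id=None):
        if version_id is None:
            if not self._versions or self._versions[-1][1] is None:
                raise KeyError("no current version (missing or delete marker)")
            return self._versions[-1][1]
        for vid, body in self._versions:
            if vid == version_id and body is not None:
                return body
        raise KeyError(version_id)


obj = VersionedKey()
v1 = obj.put(b"first")
v2 = obj.put(b"second")
marker = obj.delete()      # adds a delete marker; obj.get() now raises KeyError
print(obj.get(v1))         # older versions remain readable by ID
obj.delete(marker)         # removing the marker makes the latest version current again
print(obj.get())
```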

S3 Website Hosting

Static Website Configuration


# Enable website hosting
aws s3 website s3://my-website-bucket/ \
  --index-document index.html \
  --error-document error.html

Bucket policy (public read):


{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-website-bucket/*"
    }
  ]
}

Website endpoint:


http://my-website-bucket.s3-website.ap-northeast-2.amazonaws.com

Redirect Rules


<RoutingRules>
  <RoutingRule>
    <Condition>
      <KeyPrefixEquals>old-page/</KeyPrefixEquals>
    </Condition>
    <Redirect>
      <ReplaceKeyPrefixWith>new-page/</ReplaceKeyPrefixWith>
    </Redirect>
  </RoutingRule>
</RoutingRules>

Using with CloudFront (recommended)


User
    ↓
[CloudFront] (CDN)
    ↓ Origin
[S3 Static Website]

Benefits:

HTTPS support

Custom domain

Global distribution

Better performance through caching

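S3 Replication

Cross-Region Replication (CRR)

Automatically replicates objects to a bucket in another region


Source Bucket (ap-northeast-2)
    ↓ replication
Destination Bucket (us-east-1)

Use cases:

Disaster recovery

Compliance

Lower latency for users in other regions

Configuration: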


{
  "Role": "arn:aws:iam::account-id:role/replication-role",
  "Rules": [
    {
      "Status": "Enabled",
      "Priority": 1,
      "Filter": {
        "Prefix": "documents/"
      },
      "DeleteMarkerReplication": {
        "Status": "Disabled"
      },
      "Destination": {
        "Bucket": "arn:aws:s3:::destination-bucket",
        "ReplicationTime": {
          "Status": "Enabled",
          "Time": {
            "Minutes": 15
          }
        },
        "Metrics": {
          "Status": "Enabled",
          "EventThreshold": {
            "Minutes": 15
          }
        }
      }
    }
  ]
}

Same-Region Replication (SRR)

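Replicates objects to another bucket in the same region

Use cases:

Log aggregation

Syncing data between production and test

Compliance

Requirements:

Versioning enabled on both the source and destination buckets

An IAM role is required

Existing objects are not replicated (only new objects); S3 Batch Replication can copy objects that already exist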

S3 Storage Classes

1. S3 Standard (default)


Use case: frequently accessed data
Access frequency: very high
Latency: milliseconds
Cost: highest (~$0.023/GB/month)

Examples:

Website content

Mobile app data

Game assets

2. S3 Intelligent-Tiering

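Automatically moves objects to the most cost-effective access tier


30 days without access → Infrequent Access tier
90 days without access → Archive Instant Access tier
90+ days without access → Archive Access tier (optional, opt-in)
180+ days without access → Deep Archive Access tier (optional, opt-in)

Characteristics:

Monitoring fee: $0.0025 per 1,000 objects per month

No retrieval fees

Useful when the access pattern is unknown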

3. S3 Standard-IA (Infrequent Access)


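Use case: infrequently accessed data that still needs fast access
Access frequency: 1-2 times per month
Latency: milliseconds
Cost: ~$0.0125/GB/month (about 50% cheaper than Standard)
Retrieval fee: $0.01/GB

Examples:

Backups

Disaster recovery files

Older media

Minimums:

Minimum storage duration: 30 days

Minimum billable object size: 128KB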
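The two minimums affect billing for small or short-lived objects: an object below the minimum size or deleted before the minimum duration is still charged as if it met them. A minimal sketch of that rule follows; `billable` is a hypothetical helper, and actual invoices also include request and retrieval charges.

```python
MIN_BILLABLE_BYTES = 128 * 1024   # 128 KB minimum billable object size
MIN_BILLABLE_DAYS = 30            # 30-day minimum storage duration


def billable(size_bytes: int, days_stored: int) -> tuple:
    """Return the (bytes, days) a Standard-IA object would be charged for."""
    return (
        max(size_bytes, MIN_BILLABLE_BYTES),
        max(days_stored, MIN_BILLABLE_DAYS),
    )


# A 10 KB object deleted after 5 days is billed at the minimums.
print(billable(10 * 1024, 5))
# A 1 MB object kept for 45 days is billed at its actual size and duration.
print(billable(1024 * 1024, 45))
```

This is why Standard-IA is usually a poor fit for many tiny or short-lived objects.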

4. S3 One Zone-IA

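Stores data in a single Availability Zone only.


Cost: ~$0.01/GB/month (20% cheaper than Standard-IA)
Availability: 99.5% (durability is still designed for 11 nines within that AZ, but data is lost if the AZ itself is destroyed)

Use cases:

Data that can be regenerated

Thumbnail images

Non-critical logs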

5. S3 Glacier Instant Retrieval


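Use case: archives accessed about once per quarter
Access frequency: once per quarter
Latency: milliseconds
Cost: ~$0.004/GB/month (roughly 80% cheaper than Standard)
Retrieval fee: $0.03/GB

Minimum storage duration: 90 days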

6. S3 Glacier Flexible Retrieval


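Use case: archives accessed 1-2 times per year
Access frequency: yearly
Retrieval time:
  - Expedited: 1-5 minutes ($0.03/GB)
  - Standard: 3-5 hours ($0.01/GB)
  - Bulk: 5-12 hours (no per-GB retrieval fee)
Cost: ~$0.0036/GB/month

Examples:

Legal records retention

Media archives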

7. S3 Glacier Deep Archive


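Use case: long-term archives (7-10 years)
Access frequency: almost never
Retrieval time:
  - Standard: within 12 hours
  - Bulk: within 48 hours
Cost: ~$0.00099/GB/month (cheapest)

Minimum storage duration: 180 days

Examples:

Compliance archives

Medical records

Permanent backups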

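Storage Class Comparison

Class	Storage cost ($/GB/month)	Retrieval fee ($/GB)	Latency	Use case
Standard	$0.023	None	ms	General-purpose data
Intelligent-Tiering	Varies	None	ms	Unknown access pattern
Standard-IA	$0.0125	$0.01	ms	Accessed 1-2 times per month
One Zone-IA	$0.01	$0.01	ms	Regenerable data
Glacier Instant	$0.004	$0.03	ms	Quarterly access
Glacier Flexible	$0.0036	$0.01	minutes to hours	Yearly access
Glacier Deep	$0.00099	$0.02	12-48 hours	Long-term retention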
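The table can be turned into a rough monthly estimate. The sketch below simply multiplies the approximate prices listed above; `monthly_cost` is a hypothetical helper, the figures are this article's ballpark numbers rather than current AWS pricing, and request, transfer, and minimum-duration charges are ignored.

```python
# USD per GB-month and USD per GB retrieved, copied from the comparison table.
STORAGE_PRICE = {
    "Standard": 0.023,
    "Standard-IA": 0.0125,
    "One Zone-IA": 0.01,
    "Glacier Instant": 0.004,
    "Glacier Flexible": 0.0036,
    "Glacier Deep": 0.00099,
}
RETRIEVAL_PRICE = {
    "Standard": 0.0,
    "Standard-IA": 0.01,
    "One Zone-IA": 0.01,
    "Glacier Instant": 0.03,
    "Glacier Flexible": 0.01,
    "Glacier Deep": 0.02,
}


def monthly_cost(storage_class: str, stored_gb: float, retrieved_gb: float = 0.0) -> float:
    """Storage plus retrieval cost for one month, in USD."""
    return (stored_gb * STORAGE_PRICE[storage_class]
            + retrieved_gb * RETRIEVAL_PRICE[storage_class])


# Compare 1 TB stored with 100 GB read back per month across all classes.
for name in STORAGE_PRICE:
    print(f"{name:18s} ${monthly_cost(name, 1000, 100):.2f}")
```

With these numbers the retrieval fee matters quickly: a colder class only pays off when the data is read back rarely.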

Lifecycle Policies

Automatically move objects to another storage class or delete them


{
  "Rules": [
    {
      "ID": "Move to IA after 30 days",
      "Filter": {"Prefix": ""},
      "Status": "Enabled",
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 90,
          "StorageClass": "GLACIER_IR"
        },
        {
          "Days": 365,
          "StorageClass": "DEEP_ARCHIVE"
        }
      ],
      "Expiration": {
        "Days": 2555
      }
    }
  ]
}

Example scenario:


Day 1: upload → Standard
After 30 days: automatic → Standard-IA
After 90 days: automatic → Glacier Instant Retrieval
After 365 days: automatic → Glacier Deep Archive
After 2555 days (7 years): automatically deleted


# Apply the lifecycle policy with the CLI
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-bucket \
  --lifecycle-configuration file://lifecycle.json
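The schedule in the rule above can be expressed as a small lookup, which is handy for checking a policy before applying it. This is a sketch under the assumption that transitions take effect exactly on the configured day; in practice S3 applies lifecycle actions asynchronously, so the real change can lag. `storage_class_at` is a hypothetical helper.

```python
from typing import Optional

# (age in days, target storage class), taken from the lifecycle rule above.
TRANSITIONS = [(30, "STANDARD_IA"), (90, "GLACIER_IR"), (365, "DEEP_ARCHIVE")]
EXPIRATION_DAYS = 2555


def storage_class_at(age_days: int) -> Optional[str]:
    """Storage class an object would be in at a given age, or None once expired."""
    if age_days >= EXPIRATION_DAYS:
        return None
    current = "STANDARD"
    for days, storage_class in TRANSITIONS:
        if age_days >= days:
            current = storage_class
    return current


for age in (0, 45, 120, 400, 3000):
    print(age, storage_class_at(age))
```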

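S3 Performance Optimization

1. Multipart Upload

Splits a large file into parts and uploads them in parallel


100MB file → [20MB] [20MB] [20MB] [20MB] [20MB]
                ↓      ↓      ↓      ↓      ↓
              uploaded concurrently → up to 5x faster

Recommendations:

100 MB or larger: Multipart Upload recommended

Larger than 5 GB: Multipart Upload required (a single PUT is capped at 5 GB)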


import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client('s3')

# Multipart settings (sizes are in bytes)
config = TransferConfig(
    multipart_threshold=25 * 1024 * 1024,  # use multipart above 25 MB
    max_concurrency=10,                    # up to 10 parts in flight at once
    multipart_chunksize=25 * 1024 * 1024,  # 25 MB per part
    use_threads=True
)

# Upload
s3.upload_file(
    'large-file.zip',
    'my-bucket',
    'large-file.zip',
    Config=config
)
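To make the splitting step concrete, the sketch below computes the part boundaries a multipart upload would use for a given part size. `plan_parts` is a hypothetical helper for illustration; the transfer manager in boto3 does this internally. The two constants reflect S3's documented limits of a 5 MiB minimum part size (except the last part) and at most 10,000 parts.

```python
MIN_PART_SIZE = 5 * 1024 * 1024   # minimum size of every part except the last
MAX_PARTS = 10_000                # maximum number of parts per upload


def plan_parts(file_size: int, part_size: int):
    """Return a list of (part_number, offset, length) tuples covering the file."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size must be at least 5 MiB")
    parts = []
    offset = 0
    while offset < file_size:
        length = min(part_size, file_size - offset)
        parts.append((len(parts) + 1, offset, length))   # part numbers start at 1
        offset += length
    if len(parts) > MAX_PARTS:
        raise ValueError("too many parts; increase part_size")
    return parts


MB = 1024 * 1024
for number, offset, length in plan_parts(100 * MB, 20 * MB):
    print(f"part {number}: offset={offset // MB}MB length={length // MB}MB")
```

Each tuple maps to one independent upload request, which is what allows the parts to be sent concurrently and retried individually.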

2. Transfer Acceleration

Speeds up transfers by routing them through CloudFront edge locations


Regular upload:
Client (Seoul) → S3 (Virginia)
↓ slow (public internet)

Transfer Acceleration:
Client (Seoul) → CloudFront Edge (Seoul) → AWS network → S3 (Virginia)
↓ fast (AWS backbone)


# Enable Transfer Acceleration
aws s3api put-bucket-accelerate-configuration \
  --bucket my-bucket \
  --accelerate-configuration Status=Enabled

# Upload through the accelerate endpoint (the CLI adds the bucket name itself)
aws s3 cp large-file.zip \
  s3://my-bucket/ \
  --endpoint-url https://s3-accelerate.amazonaws.com

Cost: an additional per-GB charge on top of standard transfer pricing

3. Prefix Optimization

S3 supports 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix.


❌ Inefficient:
/data/file1.jpg
/data/file2.jpg
/data/file3.jpg
→ Single prefix: 5,500 GET/s

✅ Efficient:
/data/2024-11-19-001/file1.jpg
/data/2024-11-19-002/file2.jpg
/data/2024-11-19-003/file3.jpg
→ 3 prefixes: 16,500 GET/s
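One common way to spread keys over several prefixes is to prepend a short shard derived from a hash of the key. The sketch below is an illustration under assumptions: `sharded_key` and `aggregate_limits` are hypothetical helpers, the shard count of 16 is arbitrary, and the per-prefix figures are the ones quoted above. The trade-off is that objects are no longer listed together under one prefix.

```python
import hashlib

PUT_PER_PREFIX = 3_500   # PUT/COPY/POST/DELETE requests per second per prefix
GET_PER_PREFIX = 5_500   # GET/HEAD requests per second per prefix


def sharded_key(key: str, shards: int = 16) -> str:
    """Prepend a stable shard prefix derived from a hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % shards
    return f"{shard:02d}/{key}"


def aggregate_limits(prefix_count: int) -> tuple:
    """Theoretical (write, read) requests per second across all prefixes."""
    return prefix_count * PUT_PER_PREFIX, prefix_count * GET_PER_PREFIX


print(sharded_key("data/file1.jpg"))
print(aggregate_limits(3))   # matches the 3-prefix example above
```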

4. Byte-Range Fetches

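Download only a specific part of an object


from concurrent.futures import ThreadPoolExecutor

import boto3

s3 = boto3.client('s3')

# Download only the first 1 MB (byte ranges are inclusive)
response = s3.get_object(
    Bucket='my-bucket',
    Key='large-file.bin',
    Range='bytes=0-1048575'
)

# Fetch one inclusive byte range of the object
def download_chunk(start, end):
    response = s3.get_object(
        Bucket='my-bucket',
        Key='large-file.bin',
        Range=f'bytes={start}-{end}'
    )
    return response['Body'].read()

# Download in parallel, 10 MB per request (ranges must not overlap)
CHUNK = 10 * 1024 * 1024
ranges = [
    (0, CHUNK - 1),
    (CHUNK, 2 * CHUNK - 1),
    # ...
]
with ThreadPoolExecutor(max_workers=4) as pool:
    chunks = list(pool.map(lambda r: download_chunk(*r), ranges))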

S3 Select & Glacier Select

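Query the data inside an S3 object with SQL. AWS closed S3 Select to new customers in July 2024; existing customers can continue to use it.


import boto3

s3 = boto3.client('s3')

# Select only specific columns from a CSV file
response = s3.select_object_content(
    Bucket='my-bucket',
    Key='data.csv',
    ExpressionType='SQL',
    # CSV fields are read as strings, so cast age before comparing
    Expression='SELECT s.name, s.age FROM S3Object s WHERE CAST(s.age AS INT) > 30',
    InputSerialization={
        'CSV': {'FileHeaderInfo': 'USE'}
    },
    OutputSerialization={
        'CSV': {}
    }
)

# Process the results
for event in response['Payload']:
    if 'Records' in event:
        print(event['Records']['Payload'].decode())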
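For sanity-checking a query before running it against S3, the same filter can be reproduced locally with the standard library. This is a sketch under assumptions: `select_name_age_over` is a hypothetical helper, and the sample rows are made up for illustration. It applies the same projection and numeric filter to CSV text that has a header row.

```python
import csv
import io


def select_name_age_over(csv_text: str, min_age: int):
    """Local equivalent of selecting name and age where age > min_age."""
    reader = csv.DictReader(io.StringIO(csv_text))   # header row gives column names
    return [
        (row["name"], row["age"])
        for row in reader
        if int(row["age"]) > min_age                 # CSV values are text: cast first
    ]


# Made-up sample data for illustration only.
sample = "name,age,city\nAlice,34,Seoul\nBob,28,Busan\nCarol,41,Incheon\n"
print(select_name_age_over(sample, 30))
```

The explicit `int(...)` cast is the local counterpart of casting the column in the SQL expression, since every CSV field arrives as a string.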

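Benefits:

No need to download the entire file

Lower cost (charged for the data scanned and returned)

Faster (AWS cites up to 400% improvement)

Supported formats:

CSV

JSON

Parquet (Apache Parquet)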

Real-World Use Cases

1. Log Storage and Analysis


Application → [CloudWatch Logs]
                   ↓ Export
              [S3 Bucket: logs/]
                   ↓ Lifecycle
              Standard → IA (30 days) → Glacier (90 days)
                   ↓
              [Query with Athena]

2. Backup and Archive


[On-Premise Database]
    ↓ Daily Backup
[S3 Standard]
    ↓ after 30 days
[S3 Standard-IA]
    ↓ after 90 days
[S3 Glacier Deep Archive]
    ↓ after 7 years
Automatically deleted

3. Static Website


[S3 Static Website]
    ↓ Origin
[CloudFront]
    ↓ Custom Domain
[Route 53]
→ https://www.example.com

4. Data Lake


[Various data sources]
    ↓
[S3 Data Lake]
    ↓
[AWS Glue] ← data catalog
    ↓
[Athena / Redshift Spectrum] ← SQL queries
    ↓
[QuickSight] ← visualization

Cost Optimization Strategies

1. Storage Class Analysis


# Enable S3 Analytics
aws s3api put-bucket-analytics-configuration \
  --bucket my-bucket \
  --id storage-class-analysis \
  --analytics-configuration file://config.json

S3 analyzes access patterns and suggests a suitable storage class

2. Use Intelligent-Tiering

Optimizes automatically when the access pattern is unknown

3. Clean Up Incomplete Multipart Uploads


{
  "Rules": [
    {
      "ID": "Delete incomplete multipart uploads",
      "Filter": {"Prefix": ""},
      "Status": "Enabled",
      "AbortIncompleteMultipartUpload": {
        "DaysAfterInitiation": 7
      }
    }
  ]
}

4. Delete Previous Versions


{
  "Rules": [
    {
      "ID": "Delete old versions",
      "Filter": {"Prefix": ""},
      "Status": "Enabled",
      "NoncurrentVersionExpiration": {
        "NoncurrentDays": 30
      }
    }
  ]
}
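Applying a lifecycle configuration replaces the bucket's entire existing configuration, so the two cleanup rules above need to be sent together in one document rather than applied one after the other. The sketch below builds such a combined configuration; `build_lifecycle_configuration` is a hypothetical helper, and the rule names and day counts are the ones used in this section.

```python
import json


def build_lifecycle_configuration(abort_days: int = 7, noncurrent_days: int = 30) -> dict:
    """Combine both cleanup rules into a single lifecycle configuration."""
    return {
        "Rules": [
            {
                "ID": "Delete incomplete multipart uploads",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},   # empty prefix: applies to all objects
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": abort_days},
            },
            {
                "ID": "Delete old versions",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
            },
        ]
    }


# Write the result to lifecycle.json and pass it to
# "aws s3api put-bucket-lifecycle-configuration" as shown earlier.
print(json.dumps(build_lifecycle_configuration(), indent=2))
```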

5. Requester Pays

The requester pays the request and data transfer costs


aws s3api put-bucket-request-payment \
  --bucket my-public-dataset \
  --request-payment-configuration Payer=Requester

💡

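Amazon S3 goes beyond simple storage and serves as a comprehensive data platform with a broad feature set

Virtually unlimited scalability and 11 nines of durability

Fine-grained access control with bucket policies and IAM

Data protection through versioning

Cost optimization across 7 storage classes

Automated management with lifecycle policies

Better performance with Multipart Upload and Transfer Acceleration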