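Content-Type Charset
An HTTP header parameter that declares the character encoding of a response (Content-Type: text/html; charset=utf-8). It takes precedence over encoding declarations made inside the document.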
What Is the Content-Type Charset Parameter?
The Content-Type HTTP response header tells the browser two things: the media type of the response body (like text/html) and — optionally — the character encoding used to encode it. The encoding is specified via the charset parameter:
Content-Type: text/html; charset=UTF-8
Content-Type: text/plain; charset=ISO-8859-1
Content-Type: application/json; charset=UTF-8
Without this parameter, browsers must guess the encoding using heuristics, byte-order marks, or HTML meta tags — a process that can go wrong and produce garbled text (mojibake).
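As a rough illustration of what a client reads from this header, the following Python sketch extracts the charset parameter using the standard library's email.message.Message. The helper name charset_from_header is an assumption made for this example, not part of any framework.

```python
from email.message import Message


def charset_from_header(content_type):
    """Return the lowercased charset parameter, or None if it is absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() parses the parameters and lowercases the value.
    return msg.get_content_charset()


print(charset_from_header("text/html; charset=UTF-8"))  # utf-8
print(charset_from_header("application/json"))          # None
```

When the function returns None, the client has no declared encoding and has to fall back to guessing.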
Why It Matters for Unicode
Unicode text is abstract code points. To transmit it over a network, you must encode those code points as bytes. UTF-8 is by far the most common encoding — it can represent every Unicode code point and is backwards-compatible with ASCII. If the server sends UTF-8 bytes but the browser interprets them as ISO-8859-1, multi-byte sequences will be misread.
Example: The string "café" encoded in UTF-8 is 63 61 66 C3 A9. If interpreted as ISO-8859-1:
- C3 → Ã
- A9 → ©
- Result displayed: cafÃ©, which is classic mojibake.
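The same misreading can be reproduced in a few lines of Python. This is only a sketch of the byte-level effect; the variable names are illustrative.

```python
text = "caf\u00e9"  # "café" with a precomposed é (U+00E9)

# Encode the code points as UTF-8 bytes, as a server would.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes.hex(" "))  # 63 61 66 c3 a9

# Decode those bytes with the wrong encoding, as a misinformed client would.
misread = utf8_bytes.decode("iso-8859-1")
print(misread)  # cafÃ©

# The damage is reversible here because ISO-8859-1 maps every byte to a character.
repaired = misread.encode("iso-8859-1").decode("utf-8")
print(repaired)  # café
```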
Header vs. Meta Tag
For HTML, the encoding can be declared in two places:
<!-- HTML meta tag (in-document declaration) -->
<meta charset="UTF-8">
<!-- or legacy form: -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
# HTTP header (sent by server)
Content-Type: text/html; charset=UTF-8
The HTTP header takes precedence over the meta tag when both are present. The meta tag is a fallback for situations where the HTTP header is absent (e.g., opening a local HTML file).
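The precedence rule can be sketched in Python as follows. This is a deliberately simplified model, not the browser algorithm: real browsers follow the HTML Standard's encoding sniffing steps, which also consider a BOM. The function name pick_encoding, the 1024-byte scan window, and the windows-1252 fallback are assumptions chosen for the example.

```python
import re
from email.message import Message

# Matches both <meta charset="..."> and the legacy http-equiv form.
_META_CHARSET = re.compile(
    rb"<meta[^>]+charset\s*=\s*[\"']?([A-Za-z0-9_-]+)", re.IGNORECASE
)


def pick_encoding(content_type_header, body, fallback="windows-1252"):
    """Header charset first, then a meta declaration, then a fallback."""
    if content_type_header:
        msg = Message()
        msg["Content-Type"] = content_type_header
        charset = msg.get_content_charset()
        if charset:
            return charset
    # Only scan the start of the document (assumed window for this sketch).
    match = _META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    return fallback


html = b'<!doctype html><meta charset="ISO-8859-1"><p>hi</p>'
print(pick_encoding("text/html; charset=UTF-8", html))  # utf-8
print(pick_encoding("text/html", html))                 # iso-8859-1
print(pick_encoding(None, b"<p>hi</p>"))                # windows-1252
```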
Setting the Charset in Django
# Django sets UTF-8 by default in settings
DEFAULT_CHARSET = "utf-8"
# Response header automatically becomes:
# Content-Type: text/html; charset=utf-8
# Custom response
from django.http import HttpResponse
response = HttpResponse("Hello, 世界", content_type="text/plain; charset=utf-8")
Setting the Charset in Other Environments
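# Flask
from flask import Flask, Response
app = Flask(__name__)
@app.route("/")
def index():
    # content_type sets the full header value; mimetype is meant for the
    # bare media type only, because Flask appends the charset itself.
    return Response("Hello, 世界", content_type="text/plain; charset=utf-8")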
// Node.js / Express
res.setHeader("Content-Type", "text/html; charset=utf-8");
res.send("<p>Hello, 世界</p>");
# Nginx — add charset to responses
charset utf-8;
charset_types text/html text/plain text/css application/javascript;
JSON and charset
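RFC 7159 allowed JSON text to be encoded in UTF-8, UTF-16, or UTF-32. Its successor, RFC 8259, requires UTF-8 for JSON exchanged between systems that are not part of a closed ecosystem. The application/json media type defines no charset parameter, so adding one has no effect on compliant recipients, but it is harmless: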
Content-Type: application/json; charset=UTF-8
Modern HTTP APIs typically omit the charset for JSON since UTF-8 is assumed.
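A minimal Python sketch of producing such a response body is shown below. The helper name encode_json_body and the header dictionary are assumptions for illustration; a real framework would build the headers for you.

```python
import json


def encode_json_body(payload):
    """Serialize payload to UTF-8 bytes and build matching headers."""
    # ensure_ascii=False keeps non-ASCII characters as-is instead of \uXXXX escapes.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    headers = {
        "Content-Type": "application/json",  # no charset: UTF-8 is assumed
        "Content-Length": str(len(body)),    # length in bytes, not characters
    }
    return headers, body


headers, body = encode_json_body({"greeting": "Hello, 世界"})
# json.loads accepts UTF-8 bytes directly.
print(json.loads(body)["greeting"])
```

Either form of serialization is valid JSON; leaving ensure_ascii at its default of True yields an ASCII-only body that decodes the same way.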
BOM (Byte Order Mark)
Some tools prepend a UTF-8 BOM (EF BB BF, the UTF-8 encoding of U+FEFF) to UTF-8 files. Browsers recognize this as a UTF-8 signal, but the BOM itself is an invisible character that can cause issues in JavaScript and JSON parsing; RFC 8259 also forbids adding a BOM to JSON sent over a network. Prefer the charset header over relying on BOMs.
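The JSON parsing issue can be reproduced with Python's standard library. This sketch assumes a payload that a tool has saved with a leading BOM; the helper name parses is illustrative.

```python
import codecs
import json

# A JSON document with a UTF-8 BOM prepended.
raw = codecs.BOM_UTF8 + b'{"ok": true}'

# Plain "utf-8" keeps the BOM as an invisible U+FEFF at the start of the string.
with_bom = raw.decode("utf-8")
# "utf-8-sig" strips a leading BOM if one is present.
without_bom = raw.decode("utf-8-sig")


def parses(text):
    """Return True if text is accepted by json.loads."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses(with_bom))     # False
print(parses(without_bom))  # True
```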
Quick Facts
| Property | Value |
|---|---|
| Header format | Content-Type: text/html; charset=UTF-8 |
| Priority vs. meta tag | HTTP header wins when both present |
| Recommended charset | UTF-8 for all new content |
| Default if omitted | Browser heuristics (unreliable) |
| JSON standard | UTF-8 (RFC 8259); no charset parameter is defined, so it is optional |
| Django default | utf-8 via DEFAULT_CHARSET setting |
| Case sensitivity | Both the charset parameter name and its value are case-insensitive (utf-8 and UTF-8 are equivalent) |