Content-Type Charset

An HTTP header parameter that declares the character encoding of a response (Content-Type: text/html; charset=utf-8). It takes precedence over encoding declarations inside the document.

What Is the Content-Type Charset Parameter?

The Content-Type HTTP response header tells the browser two things: the media type of the response body (like text/html) and — optionally — the character encoding used to encode it. The encoding is specified via the charset parameter:

Content-Type: text/html; charset=UTF-8
Content-Type: text/plain; charset=ISO-8859-1
Content-Type: application/json; charset=UTF-8

Without this parameter, browsers must guess the encoding using heuristics, byte-order marks, or HTML meta tags — a process that can go wrong and produce garbled text (mojibake).
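As a rough illustration of how a client might split the header into its two parts, the sketch below uses Python's standard-library email.message.Message, which parses MIME-style parameters. The helper name parse_content_type is hypothetical and chosen only for this example.

```python
from email.message import Message


def parse_content_type(header_value):
    """Return (media_type, charset) parsed from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_type() returns the media type in lowercase;
    # get_content_charset() returns the charset in lowercase,
    # or None when the parameter is missing.
    return msg.get_content_type(), msg.get_content_charset()


html = parse_content_type("text/html; charset=UTF-8")
plain = parse_content_type("text/plain; charset=ISO-8859-1")
bare = parse_content_type("text/html")

print(html)   # ('text/html', 'utf-8')
print(plain)  # ('text/plain', 'iso-8859-1')
print(bare)   # ('text/html', None)
```

When the charset comes back as None, the receiver is in the guessing situation described above.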

Why It Matters for Unicode

Unicode text is abstract code points. To transmit it over a network, you must encode those code points as bytes. UTF-8 is by far the most common encoding — it can represent every Unicode code point and is backwards-compatible with ASCII. If the server sends UTF-8 bytes but the browser interprets them as ISO-8859-1, multi-byte sequences will be misread.

Example: The string "café" encoded in UTF-8 is 63 61 66 C3 A9. If those bytes are interpreted as ISO-8859-1:

- C3 → Ã
- A9 → ©
- Result displayed: cafÃ© (classic mojibake)
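The misreading in this example can be reproduced in a few lines of Python. This is only a demonstration of the byte-level effect, not of how any particular browser decodes a page.

```python
text = "café"

# Encode the code points as UTF-8 bytes, as a server would send them.
utf8_bytes = text.encode("utf-8")
hex_dump = utf8_bytes.hex(" ").upper()

# Decode the same bytes with the wrong and the right encoding.
misread = utf8_bytes.decode("iso-8859-1")
roundtrip = utf8_bytes.decode("utf-8")

print(hex_dump)   # 63 61 66 C3 A9
print(misread)    # cafÃ©
print(roundtrip)  # café
```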

Header vs. Meta Tag

For HTML, the encoding can be declared in two places:

<!-- HTML meta tag (in-document declaration) -->
<meta charset="UTF-8">
<!-- or legacy form: -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

# HTTP header (sent by server)
Content-Type: text/html; charset=UTF-8

The HTTP header takes precedence over the meta tag when both are present. The meta tag is a fallback for situations where the HTTP header is absent (e.g., opening a local HTML file).
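The precedence rule can be sketched as a small resolver: use the header's charset if there is one, otherwise look for a meta declaration near the start of the document, otherwise fall back to a default. This is a deliberately simplified sketch, not the full encoding-sniffing algorithm browsers implement. The function names, the regular expression, the 1024-byte scan window, and the windows-1252 fallback are all assumptions made for illustration, and BOM handling is left out.

```python
import re
from email.message import Message

# Matches both <meta charset="..."> and the legacy http-equiv form.
META_CHARSET_RE = re.compile(
    rb"<meta[^>]+charset\s*=\s*[\"']?([A-Za-z0-9_\-]+)", re.IGNORECASE
)


def charset_from_header(content_type):
    """Return the lowercased charset from a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def charset_from_meta(body):
    """Return the lowercased charset declared in a meta tag, or None."""
    match = META_CHARSET_RE.search(body[:1024])
    return match.group(1).decode("ascii").lower() if match else None


def resolve_encoding(content_type, body, fallback="windows-1252"):
    """Header first, then meta tag, then an assumed fallback."""
    return charset_from_header(content_type) or charset_from_meta(body) or fallback


page = b'<!doctype html><meta charset="UTF-8"><p>hi</p>'
print(resolve_encoding("text/html; charset=ISO-8859-1", page))  # iso-8859-1
print(resolve_encoding("text/html", page))                      # utf-8
print(resolve_encoding(None, b"<p>hi</p>"))                     # windows-1252
```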

Setting the Charset in Django

# Django sets UTF-8 by default in settings
DEFAULT_CHARSET = "utf-8"

# Response header automatically becomes:
# Content-Type: text/html; charset=utf-8

# Custom response
from django.http import HttpResponse
response = HttpResponse("Hello, 世界", content_type="text/plain; charset=utf-8")

Setting the Charset in Other Environments

# Flask
from flask import Flask, Response
app = Flask(__name__)

@app.route("/")
def index():
    # Pass the full header via content_type; mimetype expects only the
    # media type and appends its own charset for text/* types.
    return Response("Hello, 世界", content_type="text/plain; charset=utf-8")

// Node.js / Express (inside a route handler)
app.get("/", (req, res) => {
  res.setHeader("Content-Type", "text/html; charset=utf-8");
  res.send("<p>Hello, 世界</p>");
});

# Nginx: add charset to responses
charset utf-8;
charset_types text/html text/plain text/css application/javascript;
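For environments without a framework, the same idea can be shown with a bare WSGI application that uses only the Python standard library. This is a minimal sketch; the names hello_app and fake_start_response are hypothetical and exist only for this example. The key point is that the bytes written to the body and the charset declared in the header must agree.

```python
def hello_app(environ, start_response):
    """Minimal WSGI app that declares the encoding it actually uses."""
    body = "Hello, 世界".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


# Call the app directly, without a server, to inspect what it would send.
captured = {}


def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)


chunks = hello_app({}, fake_start_response)
print(captured["headers"]["Content-Type"])  # text/plain; charset=utf-8
print(b"".join(chunks).decode("utf-8"))     # Hello, 世界
```

To serve it locally, the app could be passed to wsgiref.simple_server.make_server from the standard library.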

JSON and charset

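RFC 7159 allowed JSON text to be encoded in UTF-8, UTF-16, or UTF-32, with UTF-8 as the default. RFC 8259, which obsoletes it, requires UTF-8 for JSON exchanged between systems that are not part of a closed ecosystem. The application/json media type registration defines no charset parameter, so adding one has no effect on compliant recipients; it is technically redundant but harmless: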

Content-Type: application/json; charset=UTF-8

Modern HTTP APIs typically omit the charset for JSON since UTF-8 is assumed.
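A short sketch of producing and consuming a JSON body as UTF-8 bytes with Python's standard json module. The payload and the headers dictionary are illustrative only; no HTTP library is involved.

```python
import json

payload = {"greeting": "Hello, 世界"}

# ensure_ascii=False keeps non-ASCII characters as-is instead of \uXXXX
# escapes; the resulting string is then encoded as UTF-8 for the wire.
body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
headers = {"Content-Type": "application/json"}  # no charset parameter

# json.loads accepts bytes and detects UTF-8, UTF-16, or UTF-32 itself.
decoded = json.loads(body)

print(body)
print(decoded == payload)  # True
```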

BOM (Byte Order Mark)

Some tools prepend a UTF-8 BOM (EF BB BF) to UTF-8 files. Browsers recognize this as a UTF-8 signal, but the BOM itself is an invisible character that can cause issues in JavaScript and JSON parsing. Prefer the charset header over relying on BOMs.
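The parsing problem can be seen with Python's json module, which rejects a decoded string that still begins with the BOM character. The sketch below also shows the utf-8-sig codec, which strips a leading BOM while decoding. The sample bytes are made up for the example.

```python
import codecs
import json

# A UTF-8 file that some tool has prefixed with the BOM bytes EF BB BF.
raw = codecs.BOM_UTF8 + b'{"ok": true}'

with_bom = raw.decode("utf-8")         # BOM survives as U+FEFF
without_bom = raw.decode("utf-8-sig")  # BOM is stripped

try:
    json.loads(with_bom)
    bom_rejected = False
except json.JSONDecodeError:
    bom_rejected = True

parsed = json.loads(without_bom)

print(repr(with_bom[0]))  # '\ufeff'
print(bom_rejected)       # True
print(parsed)             # {'ok': True}
```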

Quick Facts

Property               Value
Header format          Content-Type: text/html; charset=UTF-8
Priority vs. meta tag  HTTP header wins when both are present
Recommended charset    UTF-8 for all new content
Default if omitted     Browser heuristics (unreliable)
JSON standard          UTF-8 assumed; charset optional
Django default         utf-8 via the DEFAULT_CHARSET setting
Case sensitivity       charset parameter name and value are both case-insensitive (utf-8 and UTF-8 are equivalent)

Related Terms

More Terms in Web & HTML

CSS content Property

A CSS property that inserts generated content in ::before and ::after pseudo-elements and accepts Unicode escapes: content: '\2713' inserts ✓.

CSS Text Direction

CSS properties (direction, writing-mode, unicode-bidi) controlling text layout direction. Works with Unicode …

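HTML Entities

A textual way to represent characters in HTML. There are three forms: named (&amp;), decimal (&#38;), and hexadecimal (&#x26;). Required for characters that conflict with HTML syntax.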

JavaScript Intl API

ECMAScript Internationalization API providing locale-aware string comparison (Collator), number formatting (NumberFormat), date …

Punycode

An ASCII-compatible encoding that converts Unicode domain names into ASCII strings with the xn-- prefix. münchen.de → xn--mnchen-3ya.de.

Unicode in CSS

CSS supports Unicode via escape sequences (\2713 for ✓), the content property …

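XML Character References

The XML version of numeric character references: &#x2713; or &#10003;. XML has only five named entities (&amp; &lt; &gt; &quot; &apos;), whereas HTML5 has 2,231.

Text Presentation

Rendering a character as a monochrome text glyph instead of the default emoji presentation, usually by using variation selector-15 (U+FE0E).

Percent-Encoding (URL Encoding)

Encodes non-ASCII and reserved characters in a URL by replacing each byte with %XX. The text is first converted to UTF-8, then each byte is percent-encoded: é → %C3%A9.

Word Joiner

U+2060. A zero-width character that prevents line breaks. The modern replacement for U+FEFF (BOM) in its role as a zero-width no-break space.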