Content-Type Charset
An HTTP header parameter that declares the character encoding of a response (Content-Type: text/html; charset=utf-8). It takes precedence over any encoding declaration inside the document.
What Is the Content-Type Charset Parameter?
The Content-Type HTTP response header tells the browser two things: the media type of the response body (like text/html) and — optionally — the character encoding used to encode it. The encoding is specified via the charset parameter:
Content-Type: text/html; charset=UTF-8
Content-Type: text/plain; charset=ISO-8859-1
Content-Type: application/json; charset=UTF-8
Without this parameter, browsers must guess the encoding using heuristics, byte-order marks, or HTML meta tags — a process that can go wrong and produce garbled text (mojibake).
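To make the two parts of the header concrete, here is a minimal sketch that splits a Content-Type value into its media type and charset using Python's standard library. The helper name `split_content_type` is made up for this illustration, and the sketch assumes a well-formed header value.

```python
from email.message import Message


def split_content_type(value):
    """Return (media_type, charset) for a Content-Type header value.

    charset is None when the parameter is absent.
    """
    msg = Message()
    msg["Content-Type"] = value
    # Both accessors normalise their result to lower case.
    return msg.get_content_type(), msg.get_content_charset()


print(split_content_type("text/html; charset=UTF-8"))
print(split_content_type("application/json"))
```

When the second element is `None`, the receiver has to fall back to the guessing described above.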
Why It Matters for Unicode
Unicode text is a sequence of abstract code points. To transmit it over a network, you must encode those code points as bytes. UTF-8 is by far the most common encoding: it can represent every Unicode code point and is backwards-compatible with ASCII. If the server sends UTF-8 bytes but the browser interprets them as ISO-8859-1, each byte of a multi-byte sequence is decoded as a separate character.
Example: The string "café" encoded in UTF-8 is 63 61 66 C3 A9. If interpreted as ISO-8859-1:
- C3 → Ã
- A9 → ©
- Result displayed: cafÃ©, classic mojibake.
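The mismatch can be reproduced in a few lines of Python. This is only a sketch of the failure mode; the helper name `misdecode` is hypothetical.

```python
def misdecode(text, actual="utf-8", assumed="iso-8859-1"):
    """Encode text with one encoding, then decode the bytes with another."""
    return text.encode(actual).decode(assumed)


utf8_bytes = "café".encode("utf-8")
print(utf8_bytes.hex(" ").upper())  # 63 61 66 C3 A9
print(misdecode("café"))            # cafÃ©
```

ISO-8859-1 maps every byte to a character, so the wrong decoding never raises an error. The text is silently corrupted.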
Header vs. Meta Tag
For HTML, the encoding can be declared in two places:
<!-- HTML meta tag (in-document declaration) -->
<meta charset="UTF-8">
<!-- or legacy form: -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
# HTTP header (sent by server)
Content-Type: text/html; charset=UTF-8
The HTTP header takes precedence over the meta tag when both are present. The meta tag is a fallback for situations where the HTTP header is absent (e.g., opening a local HTML file).
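The precedence rule can be sketched as a small function. This is a simplified illustration, not the full detection algorithm browsers implement, which considers further signals such as byte-order marks. The function names, the regular expression, the 1024-byte scan window, and the `windows-1252` fallback are all assumptions made for this example.

```python
import re
from email.message import Message

# Rough pattern for <meta charset="..."> and the legacy http-equiv form.
META_CHARSET_RE = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?([A-Za-z0-9._:-]+)""", re.IGNORECASE
)


def header_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def pick_encoding(content_type, body, fallback="windows-1252"):
    """Prefer the HTTP header, then an in-document meta tag, then a fallback."""
    from_header = header_charset(content_type)
    if from_header:
        return from_header
    match = META_CHARSET_RE.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    return fallback


html = b'<!doctype html><meta charset="UTF-8"><p>caf\xc3\xa9</p>'
print(pick_encoding("text/html; charset=ISO-8859-1", html))  # header wins
print(pick_encoding("text/html", html))                      # meta tag used
```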
Setting the Charset in Django
# Django sets UTF-8 by default in settings
DEFAULT_CHARSET = "utf-8"
# Response header automatically becomes:
# Content-Type: text/html; charset=utf-8
# Custom response
from django.http import HttpResponse
response = HttpResponse("Hello, 世界", content_type="text/plain; charset=utf-8")
Setting the Charset in Other Environments
# Flask
from flask import Flask, Response
app = Flask(__name__)
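@app.route("/")
def index():
    # content_type sets the full header value; mimetype expects the bare
    # media type and would have a charset parameter appended a second time
    return Response("Hello, 世界", content_type="text/plain; charset=utf-8")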
// Node.js / Express
res.setHeader("Content-Type", "text/html; charset=utf-8");
res.send("<p>Hello, 世界</p>");
# Nginx — add charset to responses
charset utf-8;
charset_types text/html text/plain text/css application/javascript;
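The same pattern works without a framework. The following is a minimal sketch of a plain WSGI application that sets the header by hand. The name `hello_app` is hypothetical, and it can be served by any WSGI server, for example the standard library's `wsgiref.simple_server`.

```python
def hello_app(environ, start_response):
    """Minimal WSGI app that declares its encoding explicitly."""
    # The bytes sent must match the charset declared in the header.
    body = "Hello, 世界".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Content-Length is computed from the encoded bytes, not from the number of characters, because non-ASCII characters take more than one byte in UTF-8.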
JSON and charset
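RFC 7159 allowed JSON to be encoded in UTF-8, UTF-16, or UTF-32, with UTF-8 as the default. RFC 8259, which replaced it, requires UTF-8 for JSON exchanged between systems that are not part of a closed ecosystem. The application/json media type does not define a charset parameter, so adding one has no effect on compliant recipients, but it is harmless: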
Content-Type: application/json; charset=UTF-8
Modern HTTP APIs typically omit the charset for JSON since UTF-8 is assumed.
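As an illustration, here is a sketch of building a JSON response body in Python. The helper name `json_response` and the plain dict of headers are assumptions for this example, not any particular framework's API.

```python
import json


def json_response(payload):
    """Serialise payload as UTF-8 JSON and return (headers, body)."""
    # ensure_ascii=False keeps non-ASCII characters as UTF-8 bytes
    # instead of \uXXXX escapes; both forms are valid JSON.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = json_response({"greeting": "Hello, 世界"})
print(headers["Content-Type"])
print(json.loads(body))
```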
BOM (Byte Order Mark)
Some tools prepend a UTF-8 BOM (EF BB BF) to UTF-8 files. Browsers recognize this as a UTF-8 signal, but the BOM itself is an invisible character that can cause issues in JavaScript and JSON parsing. Prefer the charset header over relying on BOMs.
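The JSON parsing problem is easy to reproduce in Python. This sketch uses the standard `utf-8-sig` codec, which drops a leading BOM if one is present. The helper name `decode_utf8` is hypothetical.

```python
import codecs
import json


def decode_utf8(raw):
    """Decode UTF-8 bytes, removing a leading BOM if there is one."""
    return raw.decode("utf-8-sig")


raw = codecs.BOM_UTF8 + b'{"ok": true}'

# Plain "utf-8" keeps the BOM as an invisible U+FEFF first character.
text_with_bom = raw.decode("utf-8")
try:
    json.loads(text_with_bom)
except json.JSONDecodeError as exc:
    print("parse failed:", exc.msg)

print(json.loads(decode_utf8(raw)))
```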
Quick Facts
| Property | Value |
|---|---|
| Header format | Content-Type: text/html; charset=UTF-8 |
| Priority vs. meta tag | HTTP header wins when both present |
| Recommended charset | UTF-8 for all new content |
| Default if omitted | Browser heuristics (unreliable) |
| JSON standard | UTF-8 required by RFC 8259; no charset parameter defined for application/json |
| Django default | utf-8 via DEFAULT_CHARSET setting |
| Case sensitivity | Both the charset parameter name and its value are case-insensitive (utf-8 and UTF-8 are equivalent) |
Related Terms
More Terms in Web & HTML
A CSS property that inserts generated content in the ::before and ::after pseudo-elements using Unicode escapes: content: '\2713' inserts ✓.
CSS properties (direction, writing-mode, unicode-bidi) controlling text layout direction. Works with Unicode …
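A way to represent characters as text in HTML. There are three forms: named (&amp;amp;), decimal (&amp;#38;), and hexadecimal (&amp;#x26;). Required for characters that collide with HTML syntax.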
ECMAScript Internationalization API providing locale-aware string comparison (Collator), number formatting (NumberFormat), date …
An ASCII-compatible encoding that converts Unicode domain names into ASCII strings with the xn-- prefix. münchen.de → xn--mnchen-3ya.de.
CSS supports Unicode via escape sequences (\2713 for ✓), the content property …
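The XML version of numeric character references: &amp;#10003; or &amp;#x2713;. XML has only 5 named entities (&amp;amp; &amp;lt; &amp;gt; &amp;quot; &amp;apos;), whereas HTML5 has 2,231.
Rendering a character as a monochrome text glyph instead of the default emoji presentation, usually by using variation selector 15 (U+FE0E).
Encodes non-ASCII and reserved characters in a URL by replacing each byte with %XX. The character is first converted to UTF-8, then each byte is percent-encoded: é → %C3%A9.
U+2060. A zero-width character that prevents line breaks. It is the modern replacement for U+FEFF (BOM) used as a zero-width no-break space.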