[PHP] Encoding names and aliases available in mbstring
Problem
mb_convert_encoding accepts an awful lot of character encodings, doesn't it!
Answer
I built a list with the following script:
<?php
foreach (mb_list_encodings() as $e) {
    // @ hides the warning for encodings with no preferred MIME name
    echo $e . "\t" . @mb_preferred_mime_name($e) . "\t" . implode(', ', mb_encoding_aliases($e)) . "\n";
}
On PHP 5.3.3, the output is as follows.
(Columns: encoding name, preferred MIME name, aliases. "-" marks an empty field.)
pass	-	none
auto	-	unknown
wchar	-	-
byte2be	-	-
byte2le	-	-
byte4be	-	-
byte4le	-	-
BASE64	BASE64	-
UUENCODE	x-uuencode	-
HTML-ENTITIES	HTML-ENTITIES	HTML, html
Quoted-Printable	Quoted-Printable	qprint
7bit	7bit	-
8bit	8bit	binary
UCS-4	UCS-4	ISO-10646-UCS-4, UCS4
UCS-4BE	UCS-4BE	-
UCS-4LE	UCS-4LE	-
UCS-2	UCS-2	ISO-10646-UCS-2, UCS2, UNICODE
UCS-2BE	UCS-2BE	-
UCS-2LE	UCS-2LE	-
UTF-32	UTF-32	utf32
UTF-32BE	UTF-32BE	-
UTF-32LE	UTF-32LE	-
UTF-16	UTF-16	utf16
UTF-16BE	UTF-16BE	-
UTF-16LE	UTF-16LE	-
UTF-8	UTF-8	utf8
UTF-7	UTF-7	utf7
UTF7-IMAP	-	-
ASCII	US-ASCII	ANSI_X3.4-1968, iso-ir-6, ANSI_X3.4-1986, ISO_646.irv:1991, US-ASCII, ISO646-US, us, IBM367, cp367, csASCII
EUC-JP	EUC-JP	EUC, EUC_JP, eucJP, x-euc-jp
SJIS	Shift_JIS	x-sjis, SHIFT-JIS
eucJP-win	EUC-JP	eucJP-open, eucJP-ms
SJIS-win	Shift_JIS	SJIS-open, SJIS-ms
CP932	Shift_JIS	MS932, Windows-31J, MS_Kanji
CP51932	CP51932	cp51932
JIS	ISO-2022-JP	-
ISO-2022-JP	ISO-2022-JP	-
ISO-2022-JP-MS	ISO-2022-JP	ISO2022JPMS
Windows-1252	Windows-1252	cp1252
Windows-1254	Windows-1254	CP1254, CP-1254, WINDOWS-1254
ISO-8859-1	ISO-8859-1	ISO_8859-1, latin1
ISO-8859-2	ISO-8859-2	ISO_8859-2, latin2
ISO-8859-3	ISO-8859-3	ISO_8859-3, latin3
ISO-8859-4	ISO-8859-4	ISO_8859-4, latin4
ISO-8859-5	ISO-8859-5	ISO_8859-5, cyrillic
ISO-8859-6	ISO-8859-6	ISO_8859-6, arabic
ISO-8859-7	ISO-8859-7	ISO_8859-7, greek
ISO-8859-8	ISO-8859-8	ISO_8859-8, hebrew
ISO-8859-9	ISO-8859-9	ISO_8859-9, latin5
ISO-8859-10	ISO-8859-10	ISO_8859-10, latin6
ISO-8859-13	ISO-8859-13	ISO_8859-13
ISO-8859-14	ISO-8859-14	ISO_8859-14, latin8
ISO-8859-15	ISO-8859-15	ISO_8859-15
ISO-8859-16	ISO-8859-16	ISO_8859-16
EUC-CN	CN-GB	CN-GB, EUC_CN, eucCN, x-euc-cn, gb2312
CP936	CP936	CP-936, GBK
HZ	HZ-GB-2312	-
EUC-TW	EUC-TW	EUC_TW, eucTW, x-euc-tw
BIG-5	BIG5	CN-BIG5, BIG-FIVE, BIGFIVE, CP950
EUC-KR	EUC-KR	EUC_KR, eucKR, x-euc-kr
UHC	UHC	CP949
ISO-2022-KR	ISO-2022-KR	-
Windows-1251	Windows-1251	CP1251, CP-1251, WINDOWS-1251
CP866	CP866	CP866, CP-866, IBM-866
KOI8-R	KOI8-R	KOI8-R, KOI8R
KOI8-U	KOI8-U	KOI8-U, KOI8U
ArmSCII-8	ArmSCII-8	ArmSCII-8, ArmSCII8, ARMSCII-8, ARMSCII8
CP850	CP850	CP850, CP-850, IBM-850
JIS-ms	ISO-2022-JP	-
CP50220	ISO-2022-JP	-
CP50220raw	ISO-2022-JP	-
CP50221	ISO-2022-JP	-
CP50222	ISO-2022-JP	-
On PHP 7 RC6, the output is as follows.
(Columns: encoding name, preferred MIME name, aliases. "-" marks an empty field.)
pass	-	none
auto	-	unknown
wchar	-	-
byte2be	-	-
byte2le	-	-
byte4be	-	-
byte4le	-	-
BASE64	BASE64	-
UUENCODE	x-uuencode	-
HTML-ENTITIES	HTML-ENTITIES	HTML, html
Quoted-Printable	Quoted-Printable	qprint
7bit	7bit	-
8bit	8bit	binary
UCS-4	UCS-4	ISO-10646-UCS-4, UCS4
UCS-4BE	UCS-4BE	-
UCS-4LE	UCS-4LE	-
UCS-2	UCS-2	ISO-10646-UCS-2, UCS2, UNICODE
UCS-2BE	UCS-2BE	-
UCS-2LE	UCS-2LE	-
UTF-32	UTF-32	utf32
UTF-32BE	UTF-32BE	-
UTF-32LE	UTF-32LE	-
UTF-16	UTF-16	utf16
UTF-16BE	UTF-16BE	-
UTF-16LE	UTF-16LE	-
UTF-8	UTF-8	utf8
UTF-7	UTF-7	utf7
UTF7-IMAP	-	-
ASCII	US-ASCII	ANSI_X3.4-1968, iso-ir-6, ANSI_X3.4-1986, ISO_646.irv:1991, US-ASCII, ISO646-US, us, IBM367, IBM-367, cp367, csASCII
EUC-JP	EUC-JP	EUC, EUC_JP, eucJP, x-euc-jp
SJIS	Shift_JIS	x-sjis, SHIFT-JIS
eucJP-win	EUC-JP	eucJP-open, eucJP-ms
EUC-JP-2004	EUC-JP	EUC_JP-2004
SJIS-win	Shift_JIS	SJIS-open, SJIS-ms
SJIS-Mobile#DOCOMO	Shift_JIS	SJIS-DOCOMO, shift_jis-imode, x-sjis-emoji-docomo
SJIS-Mobile#KDDI	Shift_JIS	SJIS-KDDI, shift_jis-kddi, x-sjis-emoji-kddi
SJIS-Mobile#SOFTBANK	Shift_JIS	SJIS-SOFTBANK, shift_jis-softbank, x-sjis-emoji-softbank
SJIS-mac	Shift_JIS	MacJapanese, x-Mac-Japanese
SJIS-2004	Shift_JIS	SJIS2004, Shift_JIS-2004
UTF-8-Mobile#DOCOMO	UTF-8	UTF-8-DOCOMO, UTF8-DOCOMO
UTF-8-Mobile#KDDI-A	UTF-8	-
UTF-8-Mobile#KDDI-B	UTF-8	UTF-8-Mobile#KDDI, UTF-8-KDDI, UTF8-KDDI
UTF-8-Mobile#SOFTBANK	UTF-8	UTF-8-SOFTBANK, UTF8-SOFTBANK
CP932	Shift_JIS	MS932, Windows-31J, MS_Kanji
CP51932	CP51932	cp51932
JIS	ISO-2022-JP	-
ISO-2022-JP	ISO-2022-JP	-
ISO-2022-JP-MS	ISO-2022-JP	ISO2022JPMS
GB18030	GB18030	gb-18030, gb-18030-2000
Windows-1252	Windows-1252	cp1252
Windows-1254	Windows-1254	CP1254, CP-1254, WINDOWS-1254
ISO-8859-1	ISO-8859-1	ISO_8859-1, latin1
ISO-8859-2	ISO-8859-2	ISO_8859-2, latin2
ISO-8859-3	ISO-8859-3	ISO_8859-3, latin3
ISO-8859-4	ISO-8859-4	ISO_8859-4, latin4
ISO-8859-5	ISO-8859-5	ISO_8859-5, cyrillic
ISO-8859-6	ISO-8859-6	ISO_8859-6, arabic
ISO-8859-7	ISO-8859-7	ISO_8859-7, greek
ISO-8859-8	ISO-8859-8	ISO_8859-8, hebrew
ISO-8859-9	ISO-8859-9	ISO_8859-9, latin5
ISO-8859-10	ISO-8859-10	ISO_8859-10, latin6
ISO-8859-13	ISO-8859-13	ISO_8859-13
ISO-8859-14	ISO-8859-14	ISO_8859-14, latin8
ISO-8859-15	ISO-8859-15	ISO_8859-15
ISO-8859-16	ISO-8859-16	ISO_8859-16
EUC-CN	CN-GB	CN-GB, EUC_CN, eucCN, x-euc-cn, gb2312
CP936	CP936	CP-936, GBK
HZ	HZ-GB-2312	-
EUC-TW	EUC-TW	EUC_TW, eucTW, x-euc-tw
BIG-5	BIG5	CN-BIG5, BIG-FIVE, BIGFIVE
CP950	BIG5	-
EUC-KR	EUC-KR	EUC_KR, eucKR, x-euc-kr
UHC	UHC	CP949
ISO-2022-KR	ISO-2022-KR	-
Windows-1251	Windows-1251	CP1251, CP-1251, WINDOWS-1251
CP866	CP866	CP866, CP-866, IBM866, IBM-866
KOI8-R	KOI8-R	KOI8-R, KOI8R
KOI8-U	KOI8-U	KOI8-U, KOI8U
ArmSCII-8	ArmSCII-8	ArmSCII-8, ArmSCII8, ARMSCII-8, ARMSCII8
CP850	CP850	CP850, CP-850, IBM850, IBM-850
JIS-ms	ISO-2022-JP	-
ISO-2022-JP-2004	ISO-2022-JP-2004	-
ISO-2022-JP-MOBILE#KDDI	ISO-2022-JP	ISO-2022-JP-KDDI
CP50220	ISO-2022-JP	-
CP50220raw	ISO-2022-JP	-
CP50221	ISO-2022-JP	-
CP50222	ISO-2022-JP	-
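The main difference between PHP 5.3 and PHP 7 is that PHP 7 adds the encodings with "-2004" in the name and the ones with "Mobile" in the name. Beyond that, the PHP 7 listing also has GB18030 and SJIS-mac, lists CP950 as its own encoding instead of as an alias of BIG-5, and gains a few extra IBM aliases (IBM-367, IBM866, IBM850).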
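Since the listings pair each encoding name with its aliases, here is a minimal sketch of how an alias can stand in for the encoding name when calling mb_convert_encoding. It assumes the mbstring extension is loaded and uses Windows-31J, which appears above as an alias of CP932. The sample bytes and variable names are only for illustration.

```php
<?php
// Sketch only: assumes the mbstring extension is available.

// The string "日本語" as CP932 (Shift_JIS family) bytes, written as hex escapes
// so the example does not depend on the encoding of this source file.
$sjis = "\x93\xFA\x96\x7B\x8C\xEA";

// Windows-31J is listed as an alias of CP932, so both calls should give the same result.
$viaAlias = mb_convert_encoding($sjis, 'UTF-8', 'Windows-31J');
$viaName  = mb_convert_encoding($sjis, 'UTF-8', 'CP932');

var_dump($viaAlias === $viaName);

// Check the alias relationship on the PHP version you are actually running,
// since the alias lists differ between versions.
var_dump(in_array('Windows-31J', mb_encoding_aliases('CP932'), true));
```

Alias lists change between PHP versions, as the two listings show, so it seems safer to check with mb_encoding_aliases on the target environment than to rely on a list taken from another version.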
kalvo, October 15, 2021, 19:45
How do I convert a Windows-31J string to UTF-8?
fytko, May 17, 2022, 18:56
How do I convert a Windows-31J string to UTF-8?