Talk:Code page
The article Code page 3846 was nominated for deletion. The discussion was closed on 17 December 2024 with a consensus to merge the content into Code page. If you find that such action has not been taken promptly, please consider assisting in the merger instead of re-nominating the article for deletion. To discuss the merger, please use this talk page. Do not remove this template after completing the merger. A bot will replace it with {{afd-merged-from}}.
This is the talk page for discussing improvements to the Code page article. This is not a forum for general discussion of the article's subject.
This article is rated C-class on Wikipedia's content assessment scale.
This article was nominated for merging with Character encoding on February 27, 2014. The result of the discussion was to keep the articles separate.
Code page vs Encoding
After reading the article, I still do not understand how a code page and an encoding differ, as the article claims they do. --Voidvector (talk) 13:49, 31 August 2008 (UTC)
Reply: think of a code page as a list of characters, and an encoding as the way those characters are stored.
For instance, the Unicode character set has the trademark symbol at position 8482 (2122 hex).
So the code page simply says: 8482 -> TM.
If this is encoded as UTF-32, it is a 32-bit word with the value 8482. If it is encoded as UTF-16LE, it is two bytes with the values 34 and 33 (22 hex and 21 hex, low byte first).
8-bit code pages don't have different encodings: a byte is a byte. So if a code page has TM at position 153, for instance, the encoding of that character is the byte value 153. The encoding matches the code page listing byte for byte.
Pim 2 (talk) 15:00, 22 May 2009 (UTC)
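The numbers in the reply above can be checked with a short Python sketch. It is only an illustration, not anything taken from the article: it assumes Python's built-in `cp1252` codec as the example of an 8-bit code page that has TM at position 153, which the reply itself does not name.

```python
# The code point quoted in the reply: 8482 decimal, 2122 hex.
TM = chr(8482)

# Same character, three different stored forms.
utf32_bytes = list(TM.encode("utf-32-le"))   # one 32-bit word, low byte first
utf16_bytes = list(TM.encode("utf-16-le"))   # one 16-bit word, low byte first
cp1252_bytes = list(TM.encode("cp1252"))     # assumption: cp1252 as the 8-bit example

print(hex(ord(TM)))      # 0x2122
print(utf32_bytes)       # [34, 33, 0, 0]
print(utf16_bytes)       # [34, 33]
print(cp1252_bytes)      # [153]
```

In the 8-bit case the position in the table and the stored byte are the same number, which is the "byte for byte" point made above.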
I totally agree with Voidvector: the difference between a code page and a character encoding is still not clear, even with Pim 2's explanation. The definition of "character encoding" is any number of { character + code } pairs, which contradicts Pim 2's "the encoding matches the codepage listing byte for byte". So code page = character encoding; only the name is different. Sandrarossi (talk) 10:17, 6 August 2009 (UTC)
- I have to disagree with Pim. A code page doesn't just specify a character set; it also specifies how that set is encoded. For example, code page 932 doesn't just specify the JIS character set, but also how it is encoded with single bytes, lead bytes and trail bytes. For a more modern example, different encodings of Unicode have been assigned code page numbers. And old code pages can be retroactively considered to be encodings of subsets of Unicode as well. — Preceding unsigned comment added by 82.139.82.82 (talk) 15:13, 21 November 2015 (UTC)
- I added a link to the relevant section of Character encoding from the intro. -- Beland (talk) 05:35, 25 July 2020 (UTC)
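The code page 932 point raised in this thread can be illustrated with a minimal Python sketch. It assumes Python's built-in `cp932` codec is an acceptable stand-in for code page 932; the three sample characters are chosen here for illustration only.

```python
# Three characters encoded with Python's cp932 codec (assumed stand-in for code page 932).
samples = {
    "latin_a": "A",             # ASCII letter
    "halfwidth_kana": "\uff71", # halfwidth katakana A
    "hiragana_a": "\u3042",     # hiragana A
}

encoded = {name: ch.encode("cp932") for name, ch in samples.items()}

for name, raw in encoded.items():
    # Print the byte count and the bytes in hex.
    print(name, len(raw), raw.hex())
# latin_a 1 41
# halfwidth_kana 1 b1
# hiragana_a 2 82a0
```

The first two characters take a single byte each, while the third takes a lead byte followed by a trail byte, so the code page fixes the byte layout as well as the repertoire.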
MIK
MIK is almost certainly Code Page 879. ISO 8859-11 is almost certainly Code Page 873.
Code page 854
The Spanish code page 854 is not from IBM, but what was the code page layout? IBM's code page 854 was probably DOS Latin 4, continuing the sequence created by code pages 852, 853, and 855. Alexlatham96 (talk) 20:23, 12 May 2020 (UTC)
- There was a DOS code page for Spanish/Catalan that added À Á È Í Ï Ò Ó Ú Ŀ ŀ; it was supported by Wyse terminals, but I don't know the exact layout. 178.49.152.92 (talk) 06:41, 10 June 2023 (UTC)
- This has since been identified as Code page 220, and its layout has been found. Code page 854 is listed under that number only in a 1992 book by Thom Hogan (it still needs to be checked whether the English version of the book also shows code page 854). Alexlatham96 (talk) 04:36, 13 June 2024 (UTC)
Notability of individual articles
Given the decision at Wikipedia:Articles for deletion/Code page 875 to move nearly all articles on EBCDIC code pages to Wikibooks, are there other articles linked from this page that should be moved as well? -- Beland (talk) 17:05, 20 July 2020 (UTC)
Conflict between Microsoft and IBM codepage 1200
In the Microsoft part, it says:
1200 – UTF-16LE Unicode (little-endian)
1201 – UTF-16BE Unicode (big-endian)
In the IBM part, it says:
1200 – UTF-16BE Unicode (big-endian) with PUA
1201 – UTF-16BE Unicode (big-endian)
1202 – UTF-16LE Unicode (little-endian) with IBM PUA
1203 – UTF-16LE Unicode (little-endian)
This gives directly conflicting definitions, with BE and LE in conflict around 1200 / 1201.
So, what is this mess? 77.159.196.124 (talk) 13:41, 29 August 2022 (UTC)
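The practical effect of the conflict can be sketched in Python. The two tables below only transcribe the listings quoted in this thread; the names `MICROSOFT`, `IBM` and `decode_as` are hypothetical, and the PUA differences between the IBM entries are ignored here.

```python
# Listings as quoted above, mapped to Python codec names (table names are hypothetical).
MICROSOFT = {1200: "utf-16-le", 1201: "utf-16-be"}
IBM = {1200: "utf-16-be", 1201: "utf-16-be", 1202: "utf-16-le", 1203: "utf-16-le"}


def decode_as(vendor_table, code_page, raw):
    """Decode raw bytes using the byte order a vendor table assigns to a number."""
    return raw.decode(vendor_table[code_page])


# Two bytes, 22 hex then 21 hex, both labelled "code page 1200".
data = bytes([0x22, 0x21])

ms_char = decode_as(MICROSOFT, 1200, data)
ibm_char = decode_as(IBM, 1200, data)

print(hex(ord(ms_char)), hex(ord(ibm_char)))  # 0x2122 0x2221
```

The same two bytes decode to different characters depending on whose definition of 1200 is applied, so a bare "1200" label is ambiguous unless the vendor is known.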
IBM PUA mapping
Where can I find the full IBM PUA mapping? For example, code page 1056 has many PUA characters. Alexlatham96 (talk) 03:17, 1 May 2023 (UTC)
- Found it! Alexlatham96 (talk) 04:30, 13 June 2024 (UTC)
"page numbers in the IBM standard character set manual"
Until somebody can come up with a specific reference to this manual, this should be regarded as apocryphal. Note discussion at https://retrocomputing.stackexchange.com/questions/14780/is-the-ibm-standard-character-set-manual-around MarkMLl (talk) 20:59, 31 December 2023 (UTC)