Talk:Assembly language
This is the talk page for discussing improvements to the Assembly language article. This is not a forum for general discussion of the article's subject.
Assembly language was nominated as an Engineering and technology good article, but it did not meet the good article criteria at the time (September 17, 2020). There are suggestions on the review page for improving the article. If you can improve it, please do; it may then be renominated.
This level-4 vital article is rated B-class on Wikipedia's content assessment scale.
Pronunciation
Can you add the pronunciation of assembly? In particular, assembly as in light or as in literature? And where is the stress? Assembly or assembly? Thanks--Stemby (talk) 10:43, 17 June 2008 (UTC)
- It's IPA: /əˈsɛmb.lɪi/. See Wiktionary. —Ben FrantzDale (talk) 12:09, 17 June 2008 (UTC)
- Also, IBM documentation pretty much all calls it Assembler Language, even though everyone I know calls the program an assembler, but the language assembly. DEC called theirs names like Macro-11 in the titles of manuals, but assembly in the detailed description. I am not sure that there is a standard as to how to accent the pronunciation. Gah4 (talk) 21:14, 20 April 2020 (UTC)
http://en.wikipedia.org/wiki/List_of_assemblers <> Comparison of assemblers
In the "See also" section, the entry "List of assemblers" links to Comparison of assemblers! And where should the link to MACRO-11 be added? --Tim32 (talk) 19:20, 22 June 2008 (UTC)
Syntax section
Since this section deals mostly with x86 assembly language, I suggest that it be moved there. ( rCX (talk) 03:59, 4 July 2008 (UTC) )
- I moved it. rCX (talk) 00:57, 23 July 2008 (UTC)
FAQ
Does the representation described here refer to the hexadecimal addresses that machines use, such as those shown in a Blue Screen of Death error, which include a particular hardware address in hexadecimal form? (I have very little background in macro programming.)
This representation is usually defined by the hardware manufacturer, and is based on abbreviations (called mnemonics) that help the programmer remember individual instructions, registers, etc.
By the way, are the interpreters used to translate XML documents according to an XML schema considered assemblers?
--Ramu50 (talk) 03:24, 13 July 2008 (UTC)
This is mostly OT, but I can't stand to let a question on BSODs go unanswered. :) In most blue screen messages, one or more of the hex numbers you see are virtual addresses, and often one of them will be the virtual address of an instruction. If you could look in memory at that address you would find the numeric (usually displayed in hex) coding of the actual instruction (or "opcode"). The blue screen however never displays the instructions themselves, only their addresses. The documentation that comes with the Windows Debugging Tools will tell you all about how to interpret blue screen data. The debugger can of course display the hex coding for the instructions and can also "disassemble" them, turning them back into assembly language mnemonics. There is a decent quick tutorial on using the debugger here: [1]. If you want to know more than most of us need to know about how Intel x86 instructions are encoded, download the three volumes of "IA32 Architecture Software Developer's Manual" from the Intel site, particularly volumes 1 and 2. You don't need to know the instruction coding in depth unless you're writing a compiler, an assembler, a debugger, or the microcode for the CPU itself, but it helps if you understand the general principles and if you can recognize a few common patterns. I can also highly recommend Matt Pietrek's articles from MSJ, "Just Enough Assembly to Get By", for a look at how the x86 instruction set is actually used by everyday code. There's nothing there on how the instructions are coded into hex, though, only on the assembler mnemonics.
I never heard of an XML interpreter called an assembler - they're usually called interpreters, or some more specialized name depending on their function. Jeh (talk) 08:02, 16 July 2008 (UTC)
-- Fasmlib keeps cropping up throughout wikipedia Assembler articles ---
Seems to me the author of Fasmlib is quietly seeking to advertise his product through wikipedia. Fasmlib isn't even a working product as far as I can tell, nor is it unique in what it attempts to deliver ( Randolph Hyde has done a similar, yet complete work in the past ). —Preceding unsigned comment added by 217.42.215.104 (talk) 21:45, 16 November 2008 (UTC)
Ext links (Absolute Beginner's guide to assembly language)
Hi MrOllie. I'm publishing (with permission) Doug Dingus' comprehensive guide to Assembly Language for the Absolute Beginner. Not trying to sell anything, but I think it would be a valuable external resource that goes into detail beyond what would be appropriate for an article. This is in line with the External Link guidelines. —Preceding unsigned comment added by Nmcclana (talk • contribs) 02:07, 18 February 2009 (UTC)
- You're publishing? Please do not add links to your own site. See WP:EL and especially WP:COI. - MrOllie (talk) 15:42, 18 February 2009 (UTC)
- This EL doesn't promote anything other than a better understanding of assembly language. And linking to an article you've reprinted doesn't qualify for automatic deletion. MrOllie thinks an EL to 'Assembly Language for the Absolute Beginner' is spam, a COI, and requires deletion without discussion. I think it is a useful guide for someone who would read about assembly language and want to learn more. Any other opinions? --Nmcclana (talk) 19:16, 18 February 2009 (UTC)
- Quoting WP:EL - 'In line with Wikipedia policies, you should avoid linking to a site that you own, maintain, or represent — even if WP guidelines seem to imply that it may otherwise be linked.' - MrOllie (talk) 19:26, 18 February 2009 (UTC)
Ext links (Pagetable.com)
Pagetable.com was removed from the external links/software section - pagetable.com is a unique resource for assembly language analysis and history/evolution of the language across various architectures and compilers; I believe it comes into category 3, Wikipedia:EL#What_should_be_linked. One of the admins should examine the content of pagetable.com and other links and not just delete en masse because they don't have either the expertise or time to identify useful resources 121.45.167.176 (talk) 20:29, 19 March 2010 (UTC)
one pass / two pass assemblers
As per this edit that was reverted [2], I think it would add something to have a brief description of one pass vs two pass assemblers. I know what they are, but don't have any references at the moment (although I'm sure I could find some). However, I have not heard of "Jove" pass assembly? Does anyone know what this is? --stmrlbs|talk 18:30, 3 June 2009 (UTC)
- I've never heard of a Jove pass assembler either. A Google search does not turn up anything likely. Shouldn't be too hard to find sources on one and two pass assemblers, though. —Preceding unsigned comment added by Yworo (talk • contribs) 19:22, 3 June 2009 (UTC)
- well, it was a bit harder than I thought to find a reference. I found a lot of interesting class notes, each with a little different interpretation. However, I don't think class notes are RS. But, I put in a brief description of the basic types of each, and the main differences. I think currently, the 2 kind of blend into each other because of more sophisticated one-pass assemblers, which build tables which allow them to plug in addresses that are forward referenced. But, imo, this is hazy as to whether it is a one-pass or a two-pass or something in between. Plus, they have multi-pass assemblers, however that is just more passes to do more sophisticated processing of the source. I left that out as I think just a basic definition will do for this article.
However, if anyone feels that they can improve it.. be bold!! --stmrlbs|talk 02:37, 4 June 2009 (UTC)
- I believe that historically it was more common to have 3 or more passes than it was to have only one. I've revised the text to reflect that, and also briefly mentioned the possible need for an extra pass when doing peephole optimization. Shmuel (Seymour J.) Metz Username:Chatul (talk) 19:14, 11 December 2011 (UTC)
- I believe, from the small memory days, assemblers (and compilers) had many phases, usually separate sets of code read from disk. It seems that OS/360 Assembler F[1] has eight phases. It might be that all macro expansion is done before the first pass as described here. I believe that conditional assembly is done at the same time. Also, there is a phase for writing out error messages, which shouldn't count as a pass. It uses (up to) three temporary files, reading and writing as it goes through phases. Gah4 (talk) 01:37, 21 April 2020 (UTC)
- It varied all over the landscape, but prior to S/360 the number of passes, even for macro-assemblers, was usually small. The FORTRAN II Assembly Program (FAP) had only two passes, with a symbol-table sort in between. 7070/7074 Autocoder had three phases. Lots of others had only two.
References
- ^ Program Logic IBM System/360 Operating System Assembler (F) (PDF). Program Logic (Third ed.). IBM. December 1970. GY26-3700-2. Retrieved 21 April 2020 – via bitsavers.
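The one-pass versus two-pass distinction discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the instruction set, the fixed two-byte instruction size, and the function names are all hypothetical and do not describe any real assembler. The two-pass version collects label addresses first and emits code second; the one-pass version emits a placeholder for each forward reference and patches it once the label is known, which is the "table of forward references" approach mentioned in the thread.

```python
# Toy instruction set (hypothetical): every instruction is 2 bytes, opcode + operand.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "JMP": 0x03, "HALT": 0xFF}

def parse(source):
    """Yield (label, mnemonic, operand) for each non-empty source line."""
    for line in source.splitlines():
        line = line.split(";")[0].strip()   # drop comments
        if not line:
            continue
        label = None
        if ":" in line:
            label, line = (part.strip() for part in line.split(":", 1))
        fields = line.split()
        mnemonic = fields[0] if fields else None
        operand = fields[1] if len(fields) > 1 else None
        yield label, mnemonic, operand

def resolve(operand, symbols):
    """Translate an operand into a number: missing -> 0, label -> address, else literal."""
    if operand is None:
        return 0
    if operand in symbols:
        return symbols[operand]
    return int(operand, 0)

def two_pass(source):
    """Pass 1 builds the symbol table; pass 2 emits code using it."""
    symbols, address = {}, 0
    for label, mnemonic, _ in parse(source):
        if label:
            symbols[label] = address
        if mnemonic:
            address += 2
    code = bytearray()
    for _, mnemonic, operand in parse(source):
        if mnemonic:
            code += bytes([OPCODES[mnemonic], resolve(operand, symbols)])
    return bytes(code)

def one_pass(source):
    """Single pass: emit a placeholder for unknown symbols and patch it afterwards."""
    symbols, fixups, code = {}, [], bytearray()
    for label, mnemonic, operand in parse(source):
        if label:
            symbols[label] = len(code)
        if not mnemonic:
            continue
        code.append(OPCODES[mnemonic])
        is_symbol = operand is not None and not operand[0].isdigit()
        if is_symbol and operand not in symbols:
            fixups.append((len(code), operand))   # remember where to patch
            code.append(0)
        else:
            code.append(resolve(operand, symbols))
    for offset, name in fixups:
        code[offset] = symbols[name]
    return bytes(code)

PROGRAM = """
start:  LOAD 5
        JMP end      ; forward reference
        ADD 1
end:    HALT
"""
```

For this toy program both functions produce the same eight bytes, which is the sense in which a one-pass assembler with a fixup table "blends into" a two-pass one.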
Manuals, etc.
The list of manuals, tutorials, etc. seems too long. I would argue that only links to sites/pages which are about assembly language in general should go here. Links for specific assembly languages for specific machines or chips belong on the article about that machine or chip. —Preceding unsigned comment added by Yworo (talk • contribs) 23:06, 9 June 2009 (UTC)
I'd also propose that we link to the Open Directory page at http://www.dmoz.org/Computers/Programming/Languages/Assembly/ Wikipedia can never be a complete directory to pages about every assembly language and we should not try to be. A link to Open Directory will give the reader a much better overview and categorization of assembly languages.
- I believe that references for specific assemblers belong here and that they do not belong in articles on hardware.
- OTOH, some of the references are about architecture and hardware and have nothing to do with assembler languages; I believe that those should be moved. Shmuel (Seymour J.) Metz Username:Chatul (talk) 12:27, 10 August 2010 (UTC)
separate Manuals / Tutorials
I think this article would be of more benefit to the public if there was a section for manuals, and a section for tutorials. A manual would be any assembly manual put out by the company that made that particular computer (IBM/Unix/etc.). Online manuals are really of great benefit to anyone interested or working in the field. Tutorials would be those tutorials written by experienced people/teachers/universities/etc. Right now.. it is all jumbled together. Not very user friendly. --stmrlbs|talk 23:10, 9 June 2009 (UTC)
- See my update above. I think we should just link to Open Directory. —Preceding unsigned comment added by Yworo (talk • contribs) 23:10, 9 June 2009 (UTC)
- Or rather, we should link only to pages/sites about assembly language in general and then provide a link to OD for those wanting to find manuals/tutorials for specific architectures. —Preceding unsigned comment added by Yworo (talk • contribs) 23:13, 9 June 2009 (UTC)
- Yworo, I took a look at the Open directory, and it is a nice idea, but I couldn't find some basic things for IBM like the online IBM manuals for assembly language. I am not against putting a link to there, it is a good source - but I don't think it is a complete source (wikipedia isn't either). As for putting the assembler manuals on the 370 system architecture article, then maybe we could put a link to it from here (and the same with other architectures). But I still think that if a company that manufactures the hardware has online manuals which programmers use as reference to program that machine, then perhaps a link to all the manuals from that particular architecture would be great.
- --stmrlbs|talk 23:32, 9 June 2009 (UTC)
- I don't believe we should be linking to the IBM manuals from this article. I think they should be linked from the article about the machine itself. This article is not about specific assembly languages. Maybe there should be subarticles about specific assembly languages, the links would be appropriate there.... —Preceding unsigned comment added by Yworo (talk • contribs) 23:36, 9 June 2009 (UTC)
- Then how about something like, There are assembly languages for these architectures: then list the architectures, with wiki links to the articles on the different architectures. Then have a link to the online manuals from the different architectures. --stmrlbs|talk 23:56, 9 June 2009 (UTC)
- also, Yworo, if you follow your posts with --~~~~ , wiki will automatically sign the post, so that people will know that you wrote the post. --stmrlbs|talk 23:58, 9 June 2009 (UTC)
- Yah, I've just been informed on my talk page about the tildes. Thanks. You may have something there about how to do this. The current list is just so disorganized. Maybe some sort of a "list" subarticle would be better. And what about obsolete architectures, do we list them too? I took a machine language/assembly language class many years ago which used Burroughs, PDP-11, and CDC Cyber assembly languages as examples. PDP-11 assembly may well still be an excellent example of an orthogonally designed assembly language, but the other two are certainly obsolete. Do we link to info about say 8008 assembly? I still think it might be better to put the links on the pages about the machines, but perhaps we could create a see also section which mentions that that is where to find the assembly language reference links? Or? There just has to be a better solution than the current one... Yworo (talk) 13:13, 10 June 2009 (UTC)
- Yworo, I think we should just start with what we have, rather than worrying about getting everything in at once. If we set up the structure, then I think people will add in the right place if they can see it easily. How about if I set up something tonight, then you can give me your opinion, or change/tweak it, then we can go from there. Ok? --stmrlbs|talk 18:27, 10 June 2009 (UTC)
- Sure thing, go ahead. I'm the newcomer here... :-) Yworo (talk) 19:07, 10 June 2009 (UTC)
- Yworo, I took a look around, and I see that this has already been created: List_of_assemblers. At the bottom of the Assembler article, it has this mixed in with other links here: Assembly_language#See_also. I think we should put the link in a section by itself with a little paragraph explaining that this is a list of the different assemblers for the different machines. Then, maybe organize the rest of the links so they make better sense. What do you think? --stmrlbs|talk 09:11, 11 June 2009 (UTC)
Yworo, I created a new section: Assemblers#List_of_assemblers_for_different_computer_architectures which just has an introductory sentence explaining that a list of assemblers and architectures is on an associated page. Now, I think we can go through the links, and add those links to different assemblers into the table on the other page (if they are not already there), and take them off this page. The table is a more user friendly presentation of the information, than a bunch of links. You are right in that all the jumbled links with no organization is not good. If you need any help figuring out how to put something in the table, let me know. I will be glad to help --stmrlbs|talk 19:29, 11 June 2009 (UTC)
- OK, let's make sure I have this straight, most of the links, the ones specific to architectures, will go into the table on the subpage? And we'll have only a few general links on this one? I think that solves it quite nicely and is much more navigable and user-friendly. Good work. Yworo (talk) 13:12, 12 June 2009 (UTC)
- yes, the table gives something like a crossroads link page between the different articles - between the assembly language, and the architecture. For each specific language, the table points to a wikipedia article on that language. For each architecture, there is a wikipedia article on that architecture.
- So, I was thinking of going back to what you originally suggested, with a variation. For each wiki article on a language, add a subsection: manuals - then add the links to the assembly manuals there. And for each architecture, add a section for manuals. This will streamline this article, which, like you said, shouldn't have every assembler manual referenced in this article, and it will provide a way for a person to see all assemblers and their associated architecture, and point them to the correct articles for more information, and the manuals. If someone adds a manual here, we just move it and explain on the talk page and edit summary - that way the editor is still encouraged to contribute to the article.
- I think this will make the wikipedia articles on this subject much more user friendly, and provide a nice resource for the public. I will start with the IBM Z/Architecture Assembler (HLASM) that is my ballpark. --stmrlbs|talk 18:14, 12 June 2009 (UTC)
Modulo
I searched at length for a way to use a modulo operator in assembly language, and the closest I found was the DIV operator; however, it's not available on the simple educational assembler PEP8 [3] (French operators instruction, Chapitre 7, in http://www.er.uqam.ca/nobel/k20250/Notes_cours.html).
Is there a simpler way of doing a modulo? Perhaps with bitwise operations? --DynV (talk) 19:29, 12 June 2009 (UTC)
Only thing I can think of is performing the DIV operation to return an integral value and multiplying it with the divisor then subtracting it from the dividend. This is probably how it's done physically in architectures. ChazZeromus (talk) 18:05, 13 August 2009 (UTC)
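The approaches raised in this thread can be sketched as follows. The sketch is in Python rather than any particular assembly language, assumes non-negative integers, and uses made-up function names; it is meant only to show the arithmetic, not how any given CPU implements it. (On some machines, x86 for example, the divide instruction already leaves the remainder in a register, so no extra work is needed there.)

```python
def mod_via_div(n, d):
    """Remainder computed as dividend - quotient * divisor."""
    q = n // d          # stands in for an integer DIV instruction
    return n - q * d

def mod_power_of_two(n, d):
    """Bitwise shortcut: when d is a power of two, the remainder is the low bits of n."""
    assert d > 0 and d & (d - 1) == 0, "d must be a power of two"
    return n & (d - 1)

def mod_by_subtraction(n, d):
    """No DIV available: subtract the divisor until what is left is smaller than it."""
    while n >= d:
        n -= d
    return n
```

The bitwise form only works for power-of-two divisors; the subtraction loop works for any positive divisor but takes time proportional to the quotient.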
Immediate encodings
I thought I'd put a little section about the ModR/M operand specifier byte underneath the opcode section. Please edit if there are any errors in my wording lol. ChazZeromus (talk) 18:11, 13 August 2009 (UTC)
- I reverted this change. You are describing the underlying machine language of x86, not assembly language. The ModR/M byte details, etc., aren't part of and are not reflected in the assembly language. You could however see if the x86 architecture article could use this material. Jeh (talk) 18:14, 13 August 2009 (UTC)
- Hm, maybe there should be a section approaching the architecture of the assembler itself? This article describes the different language, and so I thought I'd put an example, but I forgot about the assembler part! Alright, I'll see what I can plug around in the x86 article. 72.91.245.68 (talk) 21:45, 13 August 2009 (UTC)
- Some assemblers are remarkably flexible when it comes to typing in immediate code. I can't speak for the x86 series of microprocessors as I have no real experience with that microprocessor, but there are assemblers around for 6502s that let you type the following kind of stuff in:
- hex 41424344 45464748 ; lowercase PETASCII clumped together into 4 byte series
- hex 41 42 43 44 45 46 47 48 ; lowercase PETASCII with a space between the arguments for easier reading
- hex c1 c2 c3 c4 c5 c6 c7 c8 ; uppercase PETASCII same thing as above, but it's standard uppercase format
- bin 1010 0000 1011 1111 ; series of 4 bit nybbles with zeroes on the upper nybbles
- scr "abcdefgh" ; VIC-II compatible screencodes
- scr "ABCDEFGH" ; VIC-II compatible screen codes
- and so on. Naturally, good assemblers also let you decide for yourself whether the bytes are being generated in reverse order (so, in effect, you could spell strings backward). Assemblers usually (or at least often) try to avoid byte-framing errors by keeping the object code on an even boundary, unless the programmer actually wanted things to end up on odd boundaries for copy protection purposes. Similarly, some assemblers are designed in such a way that you can generate and deposit object directly in memory, or in specific tracks and sectors. An essential feature for programmers who prefer to write their own Disk Operating Systems. Dexter Nextnumber (talk) 22:29, 9 December 2009 (UTC)
As far as I know, *all* assemblers have directives for entering hex and other literal values. There is nothing special about what you are describing. Good little assemblers everywhere let you enter the byte codes "in the same sequence" that you want them to appear in memory, the only issue being endianness. It would be a very unusual assembler indeed that would reverse the order of an arbitrary sequence of bytes for you. Maybe as a Unicode function, but not otherwise; I don't recall ever encountering such a directive. Assemblers only align the bytes on memory boundaries when and where you tell them to by using the appropriate directives; they have no crystal ball to allow them to predict where the alignment should be placed. And unless you are dealing with FORTH and writing to BLOCKS, which are obsolete, assemblers do NOT "deposit object directly ... in specific tracks and sectors"; assemblers use ordinary files, the same as any other program, and they generate executable machine code, not objects. Well, perhaps this is something unique to the VIC-20? Does the VIC not have a file system? I've never used a VIC; maybe it does use tracks and sectors. Some early mainframes did use tracks and sectors because that was before the development of file systems. OldCodger2 (talk) 22:02, 15 September 2012 (UTC)
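As a rough illustration of the point about byte order, here is a minimal Python sketch of how a `hex`-style data directive might be handled. The directive names and functions are hypothetical and are not taken from any of the 6502 assemblers mentioned above. Raw bytes are emitted in exactly the order written, and byte order only becomes a question for multi-byte values such as 16-bit words.

```python
def hex_directive(args):
    """Turn the operand text of a hypothetical `hex` directive into bytes, in source order."""
    digits = "".join(args.split())   # spaces between groups are only for readability
    if len(digits) % 2:
        raise ValueError("hex data must contain an even number of digits")
    return bytes.fromhex(digits)

def word_directive(value, byteorder="little"):
    """Emit one 16-bit value; byte order matters only for multi-byte units like this."""
    return value.to_bytes(2, byteorder)
```

With this sketch, the clumped and spaced forms of the same `hex` line yield identical output, while `word_directive(0x1234)` gives the low byte first, as a little-endian CPU such as the 6502 expects.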
Vic-20 assembler
The main page for this article includes a reference to an assembler named "French Silk" and touts it as being the smallest assembler ever written. In fact, the Instant Editor Assembler (known more by its acronym, IEA) was the smallest assembler around. I am not sure who wrote it, but remember it was marketed through Randy Chase's Commodore Users' Group Newsletter. I am not a fan of the IEA, as I was one of many dissatisfied customers who bought a copy of it, and tried to use it. French Silk, on the other hand, had a much better reputation for speed of execution and assembling. Dexter Nextnumber (talk) 08:22, 2 January 2010 (UTC)
- No size given, but it's mentioned here Tedickey (talk) 14:10, 2 January 2010 (UTC)
Automatic optimization
Some assemblers do limited peephole optimization, e.g., converting long branches to short branches. I believe that there should be a discussion of this in the article, preferably with some examples. Shmuel (Seymour J.) Metz Username:Chatul (talk) 12:39, 10 August 2010 (UTC)
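One possible shape for such an example is sketched below in Python. Everything in it is an assumption made for illustration: the 2-byte short and 3-byte long branch sizes, the signed 8-bit displacement range, and the item format are hypothetical and not taken from any particular assembler. The idea is to start with every branch in its long form and repeatedly shrink any branch whose target turns out to be within short range, which is also why such assemblers can need extra passes.

```python
SHORT, LONG = 2, 3          # branch sizes in bytes (hypothetical encoding)

def layout(items, sizes):
    """Return label addresses and per-item addresses for the current branch sizes."""
    labels, addresses, pc = {}, [], 0
    for index, (kind, arg) in enumerate(items):
        addresses.append(pc)
        if kind == "label":
            labels[arg] = pc
        elif kind == "branch":
            pc += sizes[index]
        else:                       # ("bytes", n): n bytes of other code or data
            pc += arg
    return labels, addresses

def relax(items):
    """Start with every branch long, then shrink those whose target is in short range."""
    sizes = {i: LONG for i, (kind, _) in enumerate(items) if kind == "branch"}
    changed = True
    while changed:                  # shrinking one branch may bring another into range
        changed = False
        for i in list(sizes):
            if sizes[i] == SHORT:
                continue
            trial = {**sizes, i: SHORT}
            labels, addresses = layout(items, trial)
            # displacement is measured from the end of the short instruction
            disp = labels[items[i][1]] - (addresses[i] + SHORT)
            if -128 <= disp <= 127:
                sizes = trial
                changed = True
    return sizes

EXAMPLE = [
    ("branch", "far"), ("branch", "near"), ("bytes", 10),
    ("label", "near"), ("bytes", 200), ("label", "far"),
]
```

In `EXAMPLE`, the branch to `near` can be shortened while the branch to `far` has to stay long. Because shrinking only ever moves code closer together, a branch that has been made short stays in range.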
Neutral nomenclature
Some terms have different definitions in different assemblers. In particular, some assemblers use the term pseudo-instruction to refer to assembler directives. The article should avoid conveying an impression that the nomenclature used is universal. Shmuel (Seymour J.) Metz Username:Chatul (talk) 13:22, 10 August 2010 (UTC)
- There's no neutrality/POV issue here, so I'm removing the systematic bias template. If reliable sources give distinctly different definitions of a term, and I think this is the case, the article should reflect this. That much is very straight forward. ButOnMethItIs (talk) 08:47, 6 September 2010 (UTC)
- The NPOV issue is the removal of neutral language, as discussed in #A pseudo-opcode is a directive. Shmuel (Seymour J.) Metz Username:Chatul (talk) 13:35, 6 September 2010 (UTC)
- I think you misunderstand the meaning of neutrality. Your language may have been more accurate, but it was not more neutral. Again, I see no NPOV issue. ButOnMethItIs (talk) 15:48, 6 September 2010 (UTC)
A pseudo-opcode is a directive
A pseudo-opcode is a directive, and some assemblers use only the term pseudo-op for the functions listed in the article as pertaining to directives. The text
* An '''assembler directive''' or ''pseudo-opcode'' is a command given to an assembler. These directives may do anything from telling the assembler to include other source files, to telling it to allocate memory for constant data. Some assemblers use special syntax for directives; others do not.
should be reinstated. Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:57, 22 August 2010 (UTC)
- A pseudo-opcode is not a directive. A pseudo-opcode is a stand-in opcode for another opcode. For example, many older CPUs do not have a nop instruction. But often there is another instruction that can be used instead with the same effect as a nop. In 8086 CPUs the instruction xchg ax,ax was always used for nop. With nop being a pseudo-opcode to encode the instruction xchg ax,ax.
- That's not a pseudo-op, just an alias (sometimes called an extended mnemonic). Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:51, 23 August 2010 (UTC)
- A directive is something that generates no output code but instead directs the assembler to do some internal function.
- So is a pseudo-op. Of course, some pseudo-ops do generate output, e.g., the PUNCH statement in the System/360 assemblers. Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:51, 23 August 2010 (UTC)
- Please don't misrepresent what I wrote. I said "generates no output code" not "generates no output". HumphreyW (talk) 18:00, 23 August 2010 (UTC)
- Fine, then try the DC statement, which can be used to generate output code. And lest you quibble about that not being a directive, the article says For example, directives would be used to reserve storage areas and optionally their initial contents. Shmuel (Seymour J.) Metz Username:Chatul (talk) 11:45, 24 August 2010 (UTC)
- Even the names say exactly what they are, a directive directs the assembler, and a pseudo-opcode is a fake opcode for something else. HumphreyW (talk) 16:52, 22 August 2010 (UTC)
- Agree with HumphreyW. It may be true that some assemblers use the same term for both; nevertheless the concepts are quite different and should be named differently here. Perhaps the article could say something like "some assemblers, including x, y, and z, use the term 'a' for both 'a' and 'b'." Jeh (talk) 07:47, 23 August 2010 (UTC)
- I have already made some changes to the main article. But I think it would not be a good idea to start listing specific assemblers that misuse the term. HumphreyW (talk) 07:53, 23 August 2010 (UTC)
- I'm just thinking that a statement that "some assemblers use one term for both" would really require support from an example. Jeh (talk) 11:30, 23 August 2010 (UTC)
- What is the referent for both? Some assemblers use the term pseudo-op for what the article calls directive, and AFAIK the term is older than directive in that context. See, e.g., IBM (April 1964). IBM 7090/7094 Programming Systems FORTRAN II Assembly Program (FAP). C28-6235-3. IBM (December 30, 1966). IBM 7090/7094 IBSYS Operating System Version 13 Macro Assembly Program (MAP) Language. Fifth Edition. C28-6392-4.
- I can provide examples of assemblers that use the term properly. What is your basis for claiming the historical usage to be a misuse? Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:51, 23 August 2010 (UTC)
- The nomenclature should agree with what is actually used by the authors of assemblers, not the local CS department. Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:51, 23 August 2010 (UTC)
- IBM (1961). FORTRAN Assembly Program for the IBM 709/7090. pp. 24–53. J28-6098-1.
In addition to recognizing all the 709 machine operation codes and extended operation codes listed in the 709 Reference Manual, the FAP language also recognizes the following psueod-operations, described in detail in the succeeding chapters.
- psueod-operations [sic] is not a pseudo-opcode. HumphreyW (talk) 18:00, 23 August 2010 (UTC)
- In matters of language, e.g. word usage, I think ancient references are less good than current ones - language, especially technical language, does evolve. Jeh (talk) 19:07, 23 August 2010 (UTC)
- Is April, 2010 too ancient? AIX Version 6.1 Assembler Language Reference, "Pseudo-ops are sometimes called assembler instructions, assembler operators, or assembler directives." Shmuel (Seymour J.) Metz Username:Chatul (talk) 23:39, 23 August 2010 (UTC)
- I see three problems with your reference.
- It says "pseudo-ops" and not "pseudo-opcode".
- Fine, then use pseudo-op in the article. Historically pseudo-op and pseudo-operation are synonymous: IBM (1961). FORTRAN Assembly Program for the IBM 709/7090. pp. 24–53. J28-6098-1. Shmuel (Seymour J.) Metz Username:Chatul (talk) 11:45, 24 August 2010 (UTC)
- It uses weasel words. "sometimes"? When and by whom?
- That would be relevant if I were arguing for the legitimacy of directive; it doesn't use weasel words about the use of pseudo-op. Shmuel (Seymour J.) Metz Username:Chatul (talk) 11:45, 24 August 2010 (UTC)
- It is very vague and seems to cover everything with one catch-all term.
- Historically the term pseudo-op has covered everything that is not a machine instruction or macro invocation. The same is true of the term directive; it's as much of a catch-all term as pseudo-op. Shmuel (Seymour J.) Metz Username:Chatul (talk) 11:45, 24 August 2010 (UTC)
- But more pointedly, do you really think that using that as a reference would really help the encyclopaedia? It looks to me like it creates more confusion rather than actually clarify anything. HumphreyW (talk) 02:06, 24 August 2010 (UTC)
- Have you stopped beating your wife? You wanted a reference as to the legitimacy of the term, and I provided references. The question is whether the reference establishes the usage, not whether it is the best reference to cite in the article. I've established that the usage is decades old and still current. That should be enough to justify restoring the deleted text. Shmuel (Seymour J.) Metz Username:Chatul (talk) 11:45, 24 August 2010 (UTC)
- Reading through this discussion again I see that there is confusion about what is being discussed. I have been careful to always say pseudo-opcode, but I think the point did not get across properly. One commenter talks about pseudo-op, and said "Fine, then use pseudo-op ". Well we can't do that because it is not the same thing. Pseudo-op is ambiguous and can mean other things. Pseudo-operation is also ambiguous and is not used consistently in many cases. Pseudo-opcode has a very clearly defined meaning, and currently the article gives that meaning.
- No, it does not have a clearly defined meaning. Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:22, 7 September 2010 (UTC)
- If one wants to add pseudo-op to that article, then by all means do that. But please do not do it at the expense of removing/replacing the pseudo-opcode description. It will require a different/new description explaining the ambiguity and alternative uses. However I think that adding pseudo-op, and talking at length about various meanings in various places, will not help to make the article any clearer or better. HumphreyW (talk) 14:07, 6 September 2010 (UTC)
- All of the terms are ambiguous. My intent is not to replace one parochial view with another parochial view, but rather to convey the fact that the nomenclature is not standardized; in fact, not even the taxonomy is standardized. The initial issue was the removal of text indicating the variability.
- What I'd like to do is to use neutral descriptions of various categories and cite the various terms that are used for those categories, with references if that's not TMI. Can I do that without the changes being reverted? Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:22, 7 September 2010 (UTC)
- I wonder if it is even worthwhile to have such a lengthy description of taxonomy and/or nomenclature? Since you say it is currently wrong (or misleading, or whatever), then maybe the whole section should actually be axed from the article. But to replace with something even more cumbersome and lengthy seems unhelpful. If it is really as complicated as you say then it sounds like it is just going to confuse readers more than help them. If every term can mean every other term then the terms themselves become useless.
- As for your question "Can I do that without the changes being reverted?" that is unanswerable. There are so many editors here on Wikipedia that no one can speak for all of them. What I would suggest is that you can put your proposal here in the talk page and assuming it is well referenced with relevant sources and no one bitterly complains then it can be copied into the main page later. That will likely give the best chance of not having edits reverted. HumphreyW (talk) 04:41, 8 September 2010 (UTC)
- I wholly support the mention of alternative usage of technical terms as used in reliable sources. I think we should stay away from "x vendor says y" because there's just too much variation across vendors and authors. A short nomenclature section that briefly discusses the lack of an industry standard or clear norm and makes it clear that definitions in this article are not entirely representative sounds very reasonable. It could also be done with footnotes where appropriate, though this sounds more cumbersome. ButOnMethItIs (talk) 05:00, 8 September 2010 (UTC)
- I've added some text to make it more neutral, and have also added a reference to support the usage of pseudo opcode as equivalent to directive. I was going to make it a footnote, but I saw that the article has a long list of references that are simply links. Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:59, 8 September 2010 (UTC)
OK, I am a little late to this one, but it seems to me that different assemblers, or assemblers for different processors, use different names for some terms. A table indicating the meaning, and the different names would be useful. Still remembering from when I was first learning about assemblers, this was what confused me. The descriptions of assemblers, in at least the IBM manuals, mostly describe what the assembler does. You need to find somewhere else the descriptions of machine instructions, and how to use them. Gah4 (talk) 05:25, 29 November 2016 (UTC)
As far as I can tell, IBM OS/360 assemblers call all these assembler instructions, and PDP-10/MACRO-10[1] call them all pseudo-ops.
And VAX/Macro[2] seems to call them assembler directives. Gah4 (talk) 06:31, 29 November 2016 (UTC)
- For the IBM System/360 line, a given assembler will have a programmer's guide[3] and a language reference manual[4], neither of which explains the semantics of machine instructions in detail. The architectural details are in a separate principles of operation manual[5], possibly supplemented by manuals on specific features. Other vendors do something similar, although some do include hardware details in their assembler documentation. Shmuel (Seymour J.) Metz Username:Chatul (talk) 22:13, 29 November 2016 (UTC)
- Yes, IBM also had both manuals for most compiled languages, keeping the language definition separate from how to use the compiler. Some other companies don't make this distinction. Gah4 (talk) 23:43, 29 November 2016 (UTC)
- Maybe I am the only one to learn OS/360 Assembler only reading IBM reference manuals. It was some time of looking at the manuals, and then understanding the distinction, and Assembler Instruction doesn't make it easier, if you don't yet know the distinction. In the cases I show above, the same wording is used for allocating and initializing data blocks, defining macros, defining entry points and external names, and formatting the output listing. That is, the distinctions that some used above don't seem to exist in the cases shown. Gah4 (talk) 23:43, 29 November 2016 (UTC)
References
- ^ decsystem10 Macro Assembler Reference Manual (PDF). DEC. AA-C780C-TB. Retrieved 29 November 2016.
- ^ VAX-11_MACRO Language Reference Manual (PDF). DEC. Retrieved 29 November 2016.
- ^ OS/VS-VM/370 Assembler Programmer's Guide (Fifth ed.). IBM. September 1982. GC33-4021-4.
- ^ OS Assembler Language OS Release 21 (Tenth ed.). IBM. January 1974. GC28-6514-9.
OS/VS-DOS/VSE-VM/370 Assembler Language (Sixth ed.). IBM. March 1979. GC33-4010-5.
- ^ IBM System/360 Principles of Operation (Eighth ed.). IBM. September 1968. A22-6821-7.
IBM System/370 Principles of Operation (Eleventh ed.). IBM. September 1987. GA22-7000-10.
Prevalence of name spaces
Are there any data to support the claim that most assemblers have symbol management, e.g., name spaces? There are certainly many assemblers that don't. Shmuel (Seymour J.) Metz Username:Chatul (talk) 18:13, 23 August 2010 (UTC)
- I seriously doubt it. I'm sure most assemblers are extremely spartan in their feature set. ButOnMethItIs (talk) 01:00, 7 September 2010 (UTC)
- And for that reason, I have replaced "most" with "some." Although I don't have any reference to support even "some" I assume that the person who put the statement there in the first place would have first-hand experience. Most of the assemblers that I have used have been, as noted above, quite spartan in the features they supported. An old 6502 assembler wouldn't even calculate forward branches for you. Dead Horsey (talk) 06:48, 21 December 2010 (UTC)
Removal of Systemic bias template
I did in fact initiate a discussion prior to adding a {{Systemic bias}}; the proper course for someone who disagreed would have been to discuss the reasons instead of removing the template. Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:09, 7 September 2010 (UTC)
- There was no on-going discussion or clear claim of bias. If there were, I would have started there. ButOnMethItIs (talk) 15:54, 7 September 2010 (UTC)
- There was, and the template linked to it. Shmuel (Seymour J.) Metz Username:Chatul (talk) 18:00, 7 September 2010 (UTC)
Clarification of extended mnemonics?
There is a comment suggesting that the following text is unclear
Pseudo-opcodes are often used within the instruction set to support alternative mnemonics for instructions that the CPU designer did not specifically include. For example, many older CPUs do not have a true nop (no operation) instruction. But often there is another instruction that can be used instead with the same effect as a nop. In 8086 CPUs the instruction xchg ax,ax is used for nop. With nop being a pseudo-opcode to encode the instruction xchg ax,ax. Some disassemblers recognize this and will decode the xchg ax,ax instruction as nop.
I agree that it is misleading, and suggest the following
Extended mnemonics are often used to support specialized uses of instructions, often for purposes not obvious from the instruction name. For example, many CPUs do not have an explicit NOP instruction, but do have instructions that can be used for the purpose. In 8086 CPUs the instruction xchg ax,ax is used for nop, with nop being a pseudo-opcode to encode the instruction xchg ax,ax. Some disassemblers recognize this and will decode the xchg ax,ax instruction as nop. Similarly, IBM assemblers for System/360 and System/370 use the extended mnemonics NOP and NOPR for BC and BCR with zero masks.
Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:22, 8 September 2010 (UTC)
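To make the extended-mnemonic idea concrete, here is a minimal Python sketch (not taken from any real assembler) in which nop is simply an alias that the assembler expands to xchg ax,ax, and the disassembler chooses to print the byte 90h as nop. The table and function names are hypothetical; only the one-byte 8086 encoding of xchg ax,reg (90h plus the register number) is modelled.

```python
# Hypothetical alias table: extended mnemonic -> the real instruction it stands for.
EXTENDED_MNEMONICS = {
    "nop": ("xchg", ("ax", "ax")),
}

# 8086 16-bit register numbers, as used in the one-byte XCHG AX,reg encoding.
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}
REG16_NAMES = {number: name for name, number in REG16.items()}


def assemble(mnemonic, operands=()):
    """Assemble one instruction; only XCHG AX,reg and its alias are supported."""
    mnemonic = mnemonic.lower()
    if mnemonic in EXTENDED_MNEMONICS:
        # Expand the alias into the underlying instruction first.
        mnemonic, operands = EXTENDED_MNEMONICS[mnemonic]
    operands = tuple(op.lower() for op in operands)
    if (mnemonic == "xchg" and len(operands) == 2
            and "ax" in operands and all(op in REG16 for op in operands)):
        other = operands[1] if operands[0] == "ax" else operands[0]
        return bytes([0x90 + REG16[other]])
    raise ValueError("unsupported instruction in this sketch")


def disassemble(code):
    """Decode bytes 90h..97h, printing 90h as the extended mnemonic nop."""
    listing = []
    for byte in code:
        if byte == 0x90:
            listing.append("nop")  # same byte as xchg ax,ax
        elif 0x91 <= byte <= 0x97:
            listing.append("xchg ax," + REG16_NAMES[byte - 0x90])
        else:
            raise ValueError("unsupported opcode in this sketch")
    return listing
```

Both assemble("nop") and assemble("xchg", ("ax", "ax")) produce the same single byte, which is the point of the proposed wording: the alias adds a name, not a new instruction.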
Early alternatives to assemblers for systems programming
While it is certainly true that most compilers and operating systems in the 1950s and 1960s were written in assembler, the article overstates the degree of dominance. Notable examples of languages used for system implementation include
Shmuel (Seymour J.) Metz Username:Chatul (talk) 13:12, 18 November 2010 (UTC)
I think that such lists should be avoided, there are too many languages for them to be documented in this fashion. For instance, you somehow missed naming what is arguably the most influential language of all, a language that has probably been used to create more operating systems than any other. That language is 'C' which often has the nickname of Portable Assembly or High Level Assembly. So how many other languages got missed from that list? How about something exotic like FORTH which was an operating system complete unto itself? You have defined a nearly impossible task. If you limit it to generalities, then yes, I agree most operating system development in the later days was done in higher level languages. The earlier it was, the more likely it was to be done in assembly, it was an evolutionary process -- go back far enough and it was done in hand coded binary. OldCodger2 (talk) 07:52, 29 January 2013 (UTC)
- You seem to miss the point here... The languages mentioned were used decades before C became dominant as a systems implementation language in the late 1980s (basically because of its tight association with UNIX). The early C (or B) of 1972 was a hack used to port an early version of UNIX to the PDP-8. B as well as early versions of C lacked typing and were basically just a simplified and – sadly enough – a syntactically changed variant of Martin Richards's simple but elegant BCPL (which in turn was a partial implementation of the CPL language). A modern C appeared on the public scene around 1978, but it was not until 10 years later that it became really dominant. At that point, languages like Algol68, PL/1, BLISS, JOVIAL, PL/M, Simula, Pascal, Modula, and even Ada, had been used for systems programming for many years, just like C today. — Preceding unsigned comment added by 83.253.229.235 (talk) 01:33, 27 April 2014 (UTC)
Meta: Quotes and italics
There is a general convention to use italics to represent quoted material that is not surrounded by quotation marks or apostrophes. User:Nigelj recently added quotation marks around material that already was framed by double apostrophes, the Wiki markup for italics. E.g.,
"floating point partial arctangent"
Shouldn't it be one or the other but not both? Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:48, 19 November 2010 (UTC)
- Yes, I added the double quotes after having added the wikilinks inside the italic phrases (diff). I wasn't sure, but I felt that the blue then black type broke up the phrases making it difficult to see what was going on. I was in two minds about the double quotes, but was just trying to make the meaning easy to see. I'm not attached to them and they could be removed without any problem if people think that is best. --Nigelj (talk) 22:58, 19 November 2010 (UTC)
Macros
The section on macros has a lot of good information, but it seems to get very off-topic. It starts with a good discussion of the use of macros, and then diverges into specific uses of macros in legacy systems, using the macro assembler as a code generator for COBOL, history of the C preprocessor, and then discussions of the underlying structure of Prolog, LISP, and Forth. I'd like to trim this section down dramatically, and move the more advanced content to another article. Any comments? Dead Horsey (talk) 06:55, 21 December 2010 (UTC)
- I agree. It's legitimate to mention that macro assemblers inspired macros in other languages, but not to go into detail about macros in languages other than assemblers. Separate articles about macros in document formatting languages, e.g., SCRIPT, general programming languages, e.g., PL/I, and shell script languages, e.g., CLIST, would be helpful. Shmuel (Seymour J.) Metz Username:Chatul (talk) 11:20, 21 December 2010 (UTC)
- I will try to make some time in the next week to make these changes. I need to find out if there are other articles where the content can move. Someone put a lot of effort into writing all that information, and if it can be moved somewhere else that makes sense, I'd rather do that instead of deleting it. Dead Horsey (talk) 03:52, 22 December 2010 (UTC)
- Discussion of using the assembler macro facilities to generate code in other languages definitely belongs in this article. As I noted last year, discussion of features in other languages inspired by the macro languages of assemblers probably does not belong here. Shmuel (Seymour J.) Metz Username:Chatul (talk) 18:47, 11 December 2011 (UTC)
Add references or TMI?
I corrected the dates for some milestones in the evolution of assembler languages, and in the process I wondered whether it would be appropriate to add references, e.g., 705 Autocoder, 709 FAP, for the approximate dates. Would that be helpful, or would it be TMI? Shmuel (Seymour J.) Metz Username:Chatul (talk) 21:12, 18 January 2011 (UTC)
CPU loading
I added a citation needed template to the text referring to CPUs being idle. My first take was that while it was not true for mainframes, it was plausible for desktops, but then it occurred to me that the delays in loading the all too common bloated web pages might be due to CPU consumption for rendering. If anybody has hard data on CPU consumption in various applications on various classes of machines, citations would be appreciated. Shmuel (Seymour J.) Metz Username:Chatul (talk) 13:22, 19 January 2011 (UTC)
- Looking at the statement, it seems to me that it means that often the CPU isn't doing useful work. That might be an idle loop on many processors, or a wait state on others. When computers cost millions of dollars, people worked hard to keep them busy. Now they don't work so hard. IBM used to lease machines with a charge based on how much they were actually used, with a meter that ran when it wasn't in a wait state. Gah4 (talk) 06:36, 14 May 2018 (UTC)
- I modified the I/O routines used by the IBM 1130 at Victoria University to employ the WAIT op-code rather than just loop-on-busy, with a view to the usage meter not advancing during such WAIT intervals (before the operation-complete interrupt arrived). However, I had not looked closely enough at the behaviour of the meter (like a car's odometer): it did not stop its advance until nearly a second after the CPU had entered the "wait" state. Even the longest delay, for the page-throw action of the slow lineprinter, completed in less time. Pox. But it was fascinating to observe the flickering lights showing just how much time was spent waiting during I/O action, and a friend was prompted to devise a buffering scheme employing unused memory that could run the devices at top speed - until memory ran out. NickyMcLean (talk) 11:49, 14 May 2018 (UTC)
Why use assembly language?
Hi all,
I stumbled across this page via Random Article, it brought back happy memories of writing TSRs and trying to understand PC BIOS routines (ahem). I feel that the first main section (Why use assembly language?) dives a little deep into the nitty-gritty for an opener. But the first para of the Historical perspective section is just what the article needs at the beginning (from a layman's point of view), and I was wondering about incorporating it at the top of the page. MinorProphet (talk) 23:03, 28 April 2011 (UTC)
Why Assembly?
The "Why Assembly Language" section focussed on the question why to use assembly language rather than to code machine instructions directly. I think it gives a helpful explanation of the functionality provided by an assembler. But no sensible person will ever code machine instructions directly (e.g. with a binary editor), so this alternative is merely academic, or educational, if you like.
In contrast, the question why to write assembly language rather than a high(er) level language is a serious practical question, and I started reading the section with that question in mind. Because this aspect was not covered, I added some text. Because I am not a professional programmer, I don't know to what extent assembly language is still used today. Perhaps others can add some comments to that extent. Anyway, I am old enough to remember the days when even administrative programs were occasionally written in assembly language. Which is not as bad as it may seem if appropriate macros and subroutines are used. I guess that the introduction of C in the 1970s has greatly reduced the need to revert to assembly language. Older languages like COBOL, FORTRAN, ALGOL or even Pascal are less suited to write operating system functions, not to speak about the terrible "esperanto" developed by IBM in the 1960s called PL/I. Rbakels (talk) 10:35, 9 December 2011 (UTC)
- Actually, people did program in machine language. The three situations I'm aware of are
- Pedagogical. Force students to write in machine language to give them an understanding of why they will be using assembler for the rest of the semester.
- Binary patches[NB 1]. This was more common when assemblies ran for a long time, but still is done when the source code is not available.
- During the development of a new machine.[NB 2]
- I would argue that PL/I, used in Multics and PRIMOS, is a much better choice than C for writing an operating system. Further, the experience of Burroughs[NB 3] writing MCP suggests that Extended ALGOL[NB 4] is also a better choice. Shmuel (Seymour J.) Metz Username:Chatul (talk) 18:39, 11 December 2011 (UTC)
- Chatul - I'd agree about PL/I, but arguing over choice of a favored language is about as useful as arguing whose wife is better. Peter Flass (talk) 13:37, 16 January 2012 (UTC)
Bytecode assemblers
Would it be worth making specific mention of bytecode assemblers (e.g., Jasmin)? Or are these subsumed under the notion of a virtual machine architecture? Peter Flass (talk) 13:33, 16 January 2012 (UTC)
Well, if you want to go down that path, I think you would have to start talking about Java, JavaScript, PHP, etc, etc, all of which are Byte Code Assemblers for Virtual Machines. And none of those would really be appropriate for this page which focuses on Machine Language as executed by actual Hardware. Certainly we get into grey areas with things like QEMU or MIX, but they still are dealing with Virtual Hardware. I feel that Byte Code would really be a separate topic. As a further consideration, nobody is expected to write programs in the Byte Code itself, it is an internal language that is not normally exposed. OldCodger2 (talk) 08:25, 29 January 2013 (UTC)
- It seems to me that there isn't anything fundamentally different about Byte Code assemblers. Even the name is somewhat strange, as many hardware architectures are byte based. (See VAX for one example.) At one time, Sun did have hardware that at least partly executed JVM in hardware, and others could still build such hardware. As above, nobody is expected to write programs in JVM code, but pretty much nobody writes assembly code for the RISC processors, either, though someone has to write the templates for compiler code generators. So, it seems to me that JVM code has as much use here as, for example, SPARC assembly code. Gah4 (talk) 03:34, 20 October 2016 (UTC)
- No, Java et al are not assemblers; the source languages are not machine oriented. Jasmin, OTOH, is, although the machine in question is virtual. Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:04, 25 October 2016 (UTC)
Rumors are that Sun at least was working on hardware to execute JVM code, and might have even fabbed a chip. I don't like the name byte code, as all that means is that the opcodes and operand specifiers are in units of bytes, which is true for many hardware architectures. Sun calls (called) it JVM, which is fine with me. But maybe I was mixing assembly code and machine code. For most architectures, there is an assembly language designed along with the hardware. As long as there is only one assembler, or all assemblers accept the same input source, there is no confusion. Normally, one can discuss machine code and assembly code without any confusion. Since Jasmin is, as well as I know, not written by Sun, there can be confusion. But Sun (and now Oracle) supply the javap disassembler, so at least some of the assembler syntax had to be defined. (I don't know how close javap output is to what is needed for Jasmin input.) Even though JVM is mostly emulated, without any dedicated hardware, I don't find the fundamental ideas of machine code (JVM bits) and assembly code (Jasmin input source) fundamentally different from other machine code and assembly code. And yes, I was not trying to confuse Java source and assembler source. Gah4 (talk) 18:57, 25 October 2016 (UTC)
Help request: Decoding starting opcode 69h at ofs 000h in FAT formatted floppy boot sector
I am currently working on a significant overhaul of the File Allocation Table article for accuracy and completeness, and I have run into a question I was unable to answer myself so far. Perhaps one of you can help out and provide an answer.
Primer: All FAT formatted volumes since DOS 2.0 contain a BIOS Parameter Block (BPB) in the boot sector describing the volume, whereas DOS 1.x volumes did not. In the course of determining the actual medium format, among other methods MS-DOS, PC DOS and DR-DOS check various byte patterns at offset 000h in a boot sector in order to find out if a given sector might contain some form of BPB or not. Volumes containing a BPB typically start with a jump instruction at offset 000h to skip over the BPB. Patterns tested for by DOS include a short jump sequence "JMPS ??, NOP" (EBh ??h 90h, as seen since DOS 3.0) or a near jump (E9h ??h ??h, as seen on DOS 2.x formatted disks). On harddisks, DR-DOS (but not MS-DOS/PC DOS) additionally checks for a swapped sequence 90h EBh ??h. (To be precise, these tests alone are not enough to be sure a BPB is present since some DOS 1.1 disks contain EBh ??h 90h as well, but still have no BPB; but I won't go into further details here, as it doesn't matter in regard to my question below.)
On floppies, all these operating systems (MS-DOS, PC DOS and DR-DOS) also check for a byte pattern 69h ??h ??h at offset 000h in the boot sector. This is documented for MS-DOS/PC DOS in at least one book, and it can be found in the OpenDOS source code as well, but without further explanation. Stepping through MS-DOS in a debugger it can be verified that the test actually exists in MS-DOS as well; however, in none of the books I have checked so far is 69h a valid x86 opcode. So, I wonder what it is. Perhaps a jump instruction for a non-x86 processor? Mind that the FAT file system was also used on Ataris (Motorola 680x0) as well as on some very late CP/M variants (8080) and all MSX-DOS machines (Z80), but a starting opcode of 69h does not seem to make sense on these platforms either. The IBM PC RT was built around the IBM ROMP processor, a RISC processor -- certainly not x86-compatible, but unfortunately I cannot find any opcode maps for this specific processor. What about the extra opcodes supported by the NEC V20/V30? Or some undocumented opcode? Was there any other platform important enough to have Microsoft or IBM add this special test into the volume mount code of MS-DOS/PC DOS? Windows NT? Any ideas? --Matthiaspaul (talk) 01:41, 21 January 2012 (UTC)
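For anyone trying to follow the pattern tests described above, they can be sketched roughly as below. This is only an illustration in Python of the checks as described in this thread, not a transcription of any DOS source; the function name and the two flag parameters are hypothetical. As noted above, a match only means a BPB might be present.

```python
def may_have_bpb(sector, floppy=False, dr_dos_hard_disk=False):
    """Return True if the first three bytes match one of the described patterns.

    floppy and dr_dos_hard_disk are hypothetical flags standing in for the
    medium type and operating system performing the check.
    """
    if len(sector) < 3:
        return False
    b0, b1, b2 = sector[0], sector[1], sector[2]

    # Short jump followed by NOP: EBh ??h 90h
    if b0 == 0xEB and b2 == 0x90:
        return True
    # Near jump: E9h ??h ??h
    if b0 == 0xE9:
        return True
    # Swapped sequence 90h EBh ??h, described as a DR-DOS check on hard disks only
    if dr_dos_hard_disk and b0 == 0x90 and b1 == 0xEB:
        return True
    # The unexplained 69h ??h ??h pattern, described as checked on floppies
    if floppy and b0 == 0x69:
        return True
    return False
```

For example, may_have_bpb(bytes([0x69, 0x00, 0x00]), floppy=True) is accepted, while the same bytes with floppy=False are not, mirroring the floppy-only nature of the 69h test.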
General cleanup
I did some more general cleanup on this article, but it is still disorganized and full of redundancies. In particular the sections "current usage" and "typical applications" should be combined, but there are lots of other instances of duplication. Peter Flass (talk) 13:39, 9 March 2012 (UTC)
Two of these things are not like the others...
Please stop adding these external links:
In the first place, if you look you will notice that the "external links" here are pointers to information, not pointers to specific assemblers - the page List of assemblers is intended for that. In the second place, BBC Basic is not an assembler, even if it contains one. By that criterion, almost all C compilers would be assemblers because they allow imbedded assembly code. Peter Flass (talk) 23:34, 7 June 2012 (UTC)
Locked
As a new editor I cannot yet edit this article. Is there a way to get it unlocked? I only want to make some minor edits to make it read more smoothly. — Preceding unsigned comment added by Geau (talk • contribs) 12:09, 26 August 2012 (UTC)
- See this: http://en.wikipedia.org/wiki/Wikipedia:Protection_policy#semi HumphreyW (talk) 12:33, 26 August 2012 (UTC)
Thank you. I can see why it was protected, from looking at the history. I suppose I can wait until I am validated. --Geau (talk) 15:41, 26 August 2012 (UTC)
/* Macros */ removed a bogus claim, about built-in operations for gaming and data management
Removed: "Many assemblers have built-in (or predefined) macros for system calls and other special code sequences, such as the generation and storage of data realized through advanced bitwise and boolean operations used in gaming, software security, data management, and cryptography."
The above is just total nonsense. First off, hardly any assembler has anything like this. Second off, if it's a macro library, then that has nothing at all to do with the assembler itself. Third, I know of nothing that is specific to "gaming" that is implemented in any cpu architecture unless it is a custom designed cpu. Intel has announced/documented built-in instructions for AES, but guess what, they have yet to ship any actual x86 cpus which implement these instructions -- see Intel's manual for details. By advanced bitwise operations, I guess you mean the usual shifts and rotates? Nothing special about that. Me? I've programmed so many different computers using so many different assemblers that I've lost count. OldCodger2 (talk) 22:13, 15 September 2012 (UTC)
- I looked at this claim crosswise too. I have a vague recollection that some assemblers may have had predefined macros, maybe SDS Metasymbol for example, but not for anything mentioned. In any case I think it would need a citation. Peter Flass (talk) 23:05, 15 September 2012 (UTC)
- While we're looking at macros, the article says this "Note that this definition of "macro" is slightly different from the use of the term in other contexts, like the C programming language. C macro's created through the #define directive typically are just one line, or a few lines at most. Assembler macro instructions can be lengthy "programs" by themselves, executed by interpretation by the assembler during assembly." While this may be true of C, other high-level languages have macro facilities that can be and sometimes are used to write lengthy programs, PL/I, for example. Peter Flass (talk) 23:08, 15 September 2012 (UTC)
- Also, both assembler and PL/I macros have been written with the function of generating text unrelated to the native language, e.g., Stage 1 of the OS/360 systems generation (SysGen) process. Shmuel (Seymour J.) Metz Username:Chatul (talk) 12:00, 19 September 2012 (UTC)
- In the early days it was common for assemblers to have built-in macros but not macro libraries. However, I'm not aware of any that did so in the time-frame of the game consoles; the cases I'm aware of were much earlier. Shmuel (Seymour J.) Metz Username:Chatul (talk) 12:00, 19 September 2012 (UTC)
I disagree with your revert -- Data Sections
EvilKeyboardCat (talk · contribs) left the following comment on my talk page. I'm moving it here, as it would be better if other editors of this page joined in.
- "I don't agree with either of these changes: 'sections' are not instructions; hardware architecture can easily be a c..."
- I simply provided context and added a link, what is there not to agree with?
- If you don't agree please fix rather than revert.
- I'm an experienced assembly programmer for Hobby operating systems and as far as I'm aware data sections are produced by pseudo-ops, which are instructions.
- Please discuss.
My edit summary was truncated by Twinkle, but I went on to say something like "hardware architecture can easily be a constraint on the use of high-level languages." Taking the two parts of the edit in turn, the edit changed a sentence to begin "Data sections are instructions used to define data elements..." My point is simply that data sections are sections of assembly programs. They contain instructions, and these instructions may well be pseudo-ops, but sections are not instructions. They are sections of the program, or sections of memory - however you look at it - that contain instructions, or data. I cannot see how more simply to state that. Adding this minor confusion of terminology is not an improvement. The second part of the edit I reverted changed a sentence to specifically say that "constraints or peculiarities in the target operating system's architecture" may prevent the effective use of higher-level languages (The edit also spoiled the number agreement in the sentence by changing prevent to prevents, even though constraints or peculiarities were still plural). My point was that although the operating system may reflect and model the constraints or peculiarities of the underlying hardware, there is no benefit in specifying that it is not the hardware but the OS that has the constraints or peculiarities. Indeed, very often it is the OS that is being written in Assembler, so that the point as changed becomes rather tautologous. As to my admonishment to "please fix rather than revert", that is what I did: the phraseology was fine beforehand, the edit did not improve it, so I fixed it by putting it back exactly as it was, since there was nothing that I can see that was wrong with those passages in the first place. --Nigelj (talk) 16:52, 2 November 2012 (UTC)
Okay, I agree with you on the point that data pseudo-ops are sections not instructions, but would it be better for the start of the text on data sections to read "Data sections are" not "These are"? Perhaps it could be reworded? Also I think my link to the variables page was justified. EvilKeyboardCat (talk) 00:09, 3 November 2012 (UTC)
Data Sections is still wonky... I think the term is being misapplied. Normally the SECTIONS of a program are used to define the memory space in which it will reside. The typical divisions are between code and data spaces with some operating systems providing Read Only Protection for the code space. Other section properties are for Read Only Data and memory address alignment. Sections can be used to group related parts of data together. What is currently described by this article does not fit any of the above but instead appears to be talking about DIRECTIVES that can be used to define DATA TYPES. A TYPE is not a SECTION, the concepts are completely different. Also DIRECTIVES are sometimes called PSEUDO-OPS but they are never (as far as I know) called INSTRUCTIONS. Please clarify your intent... Tautologies aside, how much assembly programming have you people actually done? OldCodger2 (talk) 00:23, 13 December 2012 (UTC)
- I regularly work on a 16-bit x86 hobby operating system and the operating system kernel is written in assembly, which I have worked on many times. The operating system compiles using NASM and section 3.2 of the NASM manual references data and macro statements as Pseudo-instructions. This is what I thought they were called, i.e. there are real and pseudo instructions. Here is a quote from the manual: Pseudo-instructions are things which, though not real x86 machine instructions, are used in the instruction field anyway because that's the most convenient place to put them. The current pseudo-instructions are DB, DW, DD, DQ, DT, DO and DY; their uninitialized counterparts RESB, RESW, RESD, RESQ, REST, RESO and RESY; the INCBIN command, the EQU command, and the TIMES prefix. NASM calls them pseudo-instructions but different assemblers (like FASM or MASM) may call them by different names. We should put the most used name first then a (also known as foobar) afterwards. EvilKeyboardCat (talk) 08:46, 13 December 2012 (UTC)
- Nomenclature is not consistent between systems, and sometimes not even consistent between assemblers on the same system. My background is heavily S/360, through z, although I've been exposed to a dozen or so assemblers on other systems.
- A section is a block of storage[a] that a loader can treat as a unit; it can contain code, data or a mixture, and can be r/o, writable or copy on write. A section usually has a name, although due to FORTRAN there is support for unnamed common sections. The details depend on the program structure of the OS[b], not on the hardware architecture.
- I wouldn't mention this had you not asked, but I've been writing in assembly languages since 1960; I feel confident that there are others here of the same vintage. Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:26, 18 December 2012 (UTC)
Notes
I still disagree with the wording of sections. Perhaps it's because my experience has mostly been with microprocessors and yours is mostly with mainframes? Look at the NASM manual and how it describes sections for x86, which is also very similar to how sections are used for the Z80 and other processors I've programmed. Here is what the manual has to say, and I cannot reconcile it with what this article tries to say... OldCodger2 (talk) 10:45, 29 January 2013 (UTC)
6.3 SECTION or SEGMENT: Changing and Defining Sections
The SECTION directive (SEGMENT is an exactly equivalent synonym) changes which section of the output file the code you write will be assembled into.
In some object file formats, the number and names of sections are fixed; in others, the user may make up as many as they wish. Hence SECTION may sometimes give an error message, or may define a new section, if you try to switch to a section that does not (yet) exist.
The Unix object formats, and the bin object format (but see section 7.1.3), all support the standardized section names .text, .data and .bss for the code, data and uninitialized-data sections.
- A few points.
- The nomenclature among assemblers is inconsistent, and any of the terms directive, pseudo-instruction or pseudo-operation may be used with the same meaning. As noted, a section is not a pseudo-op, but the selection of what code goes into what section is controlled by pseudo-ops.
- The details of defining sections have almost nothing to do with the hardware's architecture and are strongly tied to the operating system. Historically sections were influenced by the need to support named common in FORTRAN. The NASM paragraph 6.3 quote is fully consistent with that. Shmuel (Seymour J.) Metz Username:Chatul (talk) 22:58, 31 January 2013 (UTC)
Don't like the example
Marked as resolved by Diamondl (talk) 03:25, 18 January 2018 (UTC)
I really don't like the section called "Example listing of assembly language source code". It provides no useful information and is in no way informative. Maybe a section of code from a real micro(processor/controller) would be useful, but a snippet of code from a virtual device is useless. — Preceding unsigned comment added by Mtpaley (talk • contribs) 22:57, 20 December 2012 (UTC)
I agree, the example fails to convey anything meaningful, it appears to just be a bunch of random additions without any apparent purpose. Would it be okay with people if I were to replace it with this code instead? (taken from an actual program) OldCodger2 (talk) 09:41, 29 January 2013 (UTC)
Example: x86 32 bit NASM, Note: this is a subroutine not a complete program.
;ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
;
; counts a zero-terminated ASCII string to determine its size
; in:  eax = zstr.addr
; out: ecx = zstr.count

                      zstr_count:                      ; entry point

00000030 B9FFFFFFFF       mov ecx, -1                  ; init the loop counter, pre-decrement to compensate for increment

                      .loop:
00000035 41               inc ecx                      ; add 1 to the loop counter

00000036 803C0800         cmp BYTE [eax + ecx], 0      ; compare the value at the base memory address + the loop offset to zero
0000003A 75F9             jne .loop                    ; if the memory value is Not Equal to Zero then jump to the label called '.loop'

                      .done:
                          ; we don't do a final increment because, even though the count is base 1,
                          ; we do not include the zero terminator in the count
0000003C C3               ret                          ; return to the calling program
- I agree, and I note that one difference is that the suggested example specifies the architecture and the assembler for the listing. However, it might be better to create a separate article for assembler examples and to include examples for disparate assemblers, including IBM's HLASM and an assembler for at least one RISC. Shmuel (Seymour J.) Metz Username:Chatul (talk) 21:57, 31 January 2013 (UTC)
okay, thanks for the feedback, glad you agree, I have gone ahead and updated the article. Hopefully this won't lead to any disagreements. I will leave the creation of a separate page with lots of assembly examples for a different day. OldCodger2 (talk) 18:58, 5 February 2013 (UTC)
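For readers who do not read x86, the logic of the subroutine above can be sketched in Python. This is only an illustration for this discussion, not proposed article text; the function name mirrors the `zstr_count` label, while the `memory` and `addr` parameters are stand-ins (assumptions) for the flat memory and the address passed in eax.

```python
def zstr_count(memory: bytes, addr: int) -> int:
    """Count the bytes before the zero terminator, mirroring the NASM loop."""
    count = -1                          # mov ecx, -1 (pre-decrement)
    while True:
        count += 1                      # inc ecx
        if memory[addr + count] == 0:   # cmp BYTE [eax + ecx], 0
            break                       # jne not taken: fall through to .done
    return count                        # ret, with the result in ecx


if __name__ == "__main__":
    print(zstr_count(b"hello\x00", 0))
```

As in the assembly version, the terminator itself is not counted, and a string with no terminator runs off the end of the buffer (here that raises an IndexError rather than reading stray memory).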
Link to assembler disambiguation page?
The last update changed the bare word assembler to a link. However, assembler is just a disambiguation page rather than an explanation of assembler programs. Is such a link appropriate? Shmuel (Seymour J.) Metz Username:Chatul (talk) 22:22, 7 February 2013 (UTC)
- In general, you are not supposed to link to disambiguation pages. In some cases, though, it might be interesting to see other uses for a word. More specifically, what is an assembler assembling? Is that meaning related to other uses for the words assembler and assembly? (And, for that matter, should it be assembly language or assembler language?) Gah4 (talk) 23:11, 29 November 2016 (UTC)
Assemblers scheduling instructions?
"Modern assemblers, especially for RISC architectures, such as SPARC or Power Architecture, as well as x86 and x86-64, optimize instruction scheduling to exploit the CPU pipeline efficiently." - What? There is no citation but this suggests there exists an assembler which can do this. The closest thing I can think of would be MIPS branch delay slots, where assemblers exist (such as GNU gas) that can fill the slot with an instruction. 24.85.180.193 (talk) 04:27, 11 January 2014 (UTC)
- OK, how about for IA-64 (Itanium) where a 128 bit word contains three instructions, controlling different functional units at the same time. Hopefully the assemblers provide some help on organizing instructions such that things happen at the right time. Gah4 (talk) 05:44, 29 November 2016 (UTC)
Compiler optimization versus hand optimization
One fact to keep in mind is that code rearrangements that improve performance on one processor may degrade it on another. A compiler (including an assembler) that has an option to optimize for a specific processor may give a better result than an assembler program hand-optimized for one processor but run on another. Of course, configuration management will be more complicated if you have to distribute more versions of the object code. Shmuel (Seymour J.) Metz Username:Chatul (talk) 18:40, 22 January 2014 (UTC)
User asm
Hi, I've suggested renaming seven User asm categories to User ASM. The lower-case names would conflict with the Ethnologue/IANA/ISO asm language code as used in the {{#babel:…|asm-?|…}} magic. The existing {{User Assembly Language}} templates -0…-5 and -N won't be affected. –Be..anyone (talk) 05:48, 15 February 2014 (UTC)
ERRATA sheet, what?
I consider myself an assembler expert, and I've never come across an errata sheet, much less one that is processed by a linker. The text also suggests that one-pass assemblers were the norm on primitive systems, while in fact two-pass assemblers were; a long source on paper tape was passed through twice on, e.g., Intel's first development system for the 8080. Later came the Isis system featuring dual 8-inch floppies, and undoubtedly its assembler was two-pass, but who is to prove it? 80.100.243.19 (talk) 04:21, 19 February 2014 (UTC)
variable instruction length
The authors are muddying the waters by considering early on in the discussion of the number of passes the possibility that the assembler decides whether a jump could be long or short. It is quite common that the assembler forces or allows the programmer to specify the length. Under that assumption the one versus two pass can be discussed clearly. It must also be pointed out that in that case more than two passes are never applicable.
- Address fix-ups and exploitation of short jumps are not the only reasons for multiple passes, especially in the early days. Shmuel (Seymour J.) Metz Username:Chatul (talk) 20:10, 26 February 2014 (UTC)
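The case the first comment sets aside, where the assembler itself chooses between a short and a long jump, can be sketched as follows. This is a toy model for discussion only: the program representation and the `relax` function are hypothetical, and the 2-byte and 5-byte sizes are borrowed from the short and near forms of x86 JMP in 32-bit mode. It starts with every jump short and grows any jump whose target is out of range, repeating until nothing changes.

```python
SHORT, LONG = 2, 5   # bytes: 8-bit displacement vs 32-bit displacement


def relax(program):
    """program: list of ("jmp", label), ("label", name) or ("fill", nbytes).

    Returns (sizes of the jumps in program order, number of sizing passes).
    """
    sizes = {i: SHORT for i, (kind, _) in enumerate(program) if kind == "jmp"}
    passes = 0
    while True:
        passes += 1
        # Lay out the program using the current size guesses.
        labels, starts, address = {}, {}, 0
        for i, (kind, arg) in enumerate(program):
            if kind == "label":
                labels[arg] = address
            elif kind == "fill":
                address += arg
            else:
                starts[i] = address
                address += sizes[i]
        # Grow every short jump whose displacement no longer fits in 8 bits.
        grew = False
        for i, start in starts.items():
            if sizes[i] == SHORT:
                displacement = labels[program[i][1]] - (start + SHORT)
                if not -128 <= displacement <= 127:
                    sizes[i] = LONG
                    grew = True
        if not grew:
            return [sizes[i] for i in sorted(sizes)], passes


if __name__ == "__main__":
    program = [("jmp", "A"), ("jmp", "B"), ("fill", 124),
               ("label", "A"), ("fill", 10), ("label", "B")]
    print(relax(program))
```

In the sample program, growing the second jump pushes label A out of range of the first jump, so a further sizing pass is needed. That is one reason an assembler that picks jump lengths automatically may need more than two passes, whereas programmer-specified lengths avoid the iteration.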
Sample code missing language
User:Ankitapasricha (no talk page) added a sample code section, but failed to mention the language. Any sample code in an article covering multiple languages should indicate the specific language, and, in this case, also the hardware platform. If anybody knows the processor, OS and assembler for the sample code, please update Assembly language#Sample Code to reflect that.
Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:06, 2 October 2014 (UTC)
- Do we even need this section? It's nice-looking code and all, but what point is being made by this? Even if the language is given, do we expect readers to decode the instructions to follow the algorithm? If the point is just to give a "feel" for assembly, we have an image of some assembly code already included at the start of the article. --A D Monroe III (talk) 17:46, 16 October 2014 (UTC)
- I could make a case for having examples, but I could also make a case for not having them, at least in the same article. My major concern is that if there are to be examples that they should be properly identified. I certainly have no objections should you decide to remove the unattributed example. Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:01, 24 October 2014 (UTC)
Errata assemblers, huh?
I count myself as an assembler expert, and have even written a few. I have never come across an errata assembler as described here, let alone as a class in its own right, apart from multi-pass assemblers. Is this a crippled way to describe assemblers that give relocatable output? But in that case the term one-pass is misleading, as relocatable output must be processed by a linker before it can be executed.
I'm so nonplussed that I hesitate to try to clear things up. — Preceding unsigned comment added by 80.100.243.19 (talk) 14:09, 2 December 2014 (UTC)
- I've read about such one-pass assemblers, although I've never encountered one. But then I only started in 1960, and there were a lot of assemblers out by then. Shmuel (Seymour J.) Metz Username:Chatul (talk) 20:17, 2 December 2014 (UTC)
- I remember stories about Soviet processors, where each came with a list indicating which instructions didn't work. (Similar to a bad-block map for older disk drives.) I suspect that having assemblers and compilers that could work with such lists would be useful. I don't know if that is related to errata assemblers, though. Gah4 (talk) 22:59, 29 November 2016 (UTC)
reverted edit - opinions?
One of my recent edits was reverted:
While the original version may have been more concise I felt that it was incomplete. I'd appreciate it if someone could provide another opinion. Peter Flass (talk) 22:45, 12 February 2016 (UTC)
Typical applications - reverse engineering
Reverse engineering of software/firmware can be used for many reasons, from the most worthy to the most unworthy:
- To address a flaw in a product that is no longer supported by the company that designed it, either because that company is defunct or because it moved on to fancier products...
- For example, the Turbo-C 1.0 compiler has been released into the public domain, but if you try to use a timing function on a modern computer (400 MHz+), it won't work. That particular library function would have to be rewritten for fast computers.
- For educational purposes: figuring out how things are done.
- For example, you can disassemble any product to learn how good (or bad) algorithms are written.
- To make a competitive product, changing just enough to circumvent a patent.
- To find the security holes in a product in order to create viruses or other malware.
I wonder whether the sentence that was added to the article today belongs here or would fit better under reverse engineering. Dhrm77 (talk) 13:26, 22 June 2016 (UTC)
Machine Language Assemblers
An assembler for assembling machine language uses mnemonics to represent the binary codes of machine language. So it is not technically a separate language but an easier-to-remember alphabet for typing that machine language. Very high-level macro-assemblers create what may look like an addition to the language, giving rise to the idea that a new and separate language has been created, but this article is about Machine Language Assemblers and should avoid such confusion, IMHO. Scottprovost (talk) 02:59, 14 November 2016 (UTC) Scottprovost (talk) 02:55, 14 November 2016 (UTC)
- You are forgetting (or perhaps you were never aware of) the fact that the vast majority of assemblers, even non-macro assemblers, support symbolic names for memory locations - both for code (branch points) and for data. The assembler must assign numeric locations for each symbol and include the appropriate address in the generated binary code for each of the instructions that reference such locations. So it is not just a matter of changing mnemonic opcodes, etc., into the machine language. Plus, in most modern environments, the output of the assembler is not yet executable; it must be processed by a "linker"; some of the assembler language constructs are interpreted by the linker, or are instructions to it. Jeh (talk) 19:53, 14 November 2016 (UTC)
- Usually there is a link step, but not always. For IBM S/360, there is a three card loader that will load an object program, as generated by the assembler, into core and start executing it. While the usual way to use an assembler is to generate relocatable object code, many have the ability to generate absolute addresses. For the IBM 360/20, the usual way to load programs is with a one card loader. (There is no OS, you just load and run your program.) Other systems have a similar ability to load and run programs. Gah4 (talk) 22:57, 29 November 2016 (UTC)
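The point about symbolic names can be made concrete with a minimal two-pass sketch. Everything here is hypothetical: the toy instruction set (one opcode byte plus one operand byte), the `BYTE` directive and the function names are made up for illustration and do not correspond to any real assembler. Pass 1 only assigns addresses to labels; pass 2 emits bytes, substituting each symbol's address.

```python
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "JMP": 0x04}  # toy ISA (assumption)


def parse(line):
    """Split 'label: MNEMONIC operand' into its three parts."""
    label = None
    if ":" in line:
        label, line = line.split(":", 1)
        label = label.strip()
    fields = line.split()
    mnemonic = fields[0] if fields else None
    operand = fields[1] if len(fields) > 1 else None
    return label, mnemonic, operand


def size_of(mnemonic):
    """Instructions take 2 bytes, the BYTE directive 1, a bare label 0."""
    if mnemonic is None:
        return 0
    return 1 if mnemonic == "BYTE" else 2


def assemble(source):
    statements = [parse(line) for line in source]
    # Pass 1: walk the program once to learn the address of every label.
    symbols, address = {}, 0
    for label, mnemonic, _ in statements:
        if label:
            symbols[label] = address
        address += size_of(mnemonic)
    # Pass 2: emit bytes, replacing symbolic operands with their addresses.
    output = bytearray()
    for _, mnemonic, operand in statements:
        if mnemonic is None:
            continue
        value = symbols[operand] if operand in symbols else int(operand, 0)
        if mnemonic == "BYTE":
            output.append(value)
        else:
            output += bytes([OPCODES[mnemonic], value])
    return bytes(output), symbols


if __name__ == "__main__":
    source = [
        "start: LOAD total",   # forward reference to a data label
        "       ADD one",
        "       STORE total",
        "       JMP start",    # backward reference to a code label
        "total: BYTE 0",
        "one:   BYTE 1",
    ]
    code, symbols = assemble(source)
    print(code.hex(), symbols)
```

The mnemonic-to-opcode lookup is the trivial part; the symbol table built in pass 1 is what lets `LOAD total` be encoded before `total` has been seen in the output.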
Machine Language Assembly Code
Since the assembly represents actual executable content and does not require decoding or compiling to run, assembler code is not "source" code; it is simply code. More accurately, it is object code. This machine code may be loaded into memory and called with no need for compiling, as long as it is in binary format. The term source code is improper but acceptable in a non-academic discussion. Whether it is appropriate in a Wikipedia article I will leave up to others. Scottprovost (talk) 03:11, 14 November 2016 (UTC) Scottprovost (talk) 03:18, 14 November 2016 (UTC)
- Incorrect. Assembly code is not the same as the machine code (what you call "binary format"), does not usually express the same thing as machine code, and is not necessarily invertibly mappable to machine code. Assembly code must be translated to the actual machine code. There is not necessarily a one-to-one correspondence between the two - for just one of the reasons, see the above comment regarding symbolic names for memory locations.
- Anyway this opinion of yours is moot as far as Wikipedia is concerned unless you have references for recognized authorities in the field expressing it. (I seriously doubt your idea is common in academia.) Do you have references? Jeh (talk) 19:45, 14 November 2016 (UTC)
- Not only is he wrong, but it's possible to write useful assembly code that generates no machine code, e.g., OS/360 SysGen Stage 1. Shmuel (Seymour J.) Metz Username:Chatul (talk) 21:19, 18 November 2016 (UTC)
- Just to be sure, some call it Assembly Source. Yes it is source code, much more readable than object code. As for OS/360 sysgen, I would not call that assembly code, but use of the assembler for something else. But one can assemble a table of address constants, which aren't machine code, but are referenced by machine code, or even non-address constants. (The initialization for a Fortran COMMON block is all data, no instructions.) Gah4 (talk) 01:35, 19 November 2016 (UTC)
Not one to one
Assemblers are, in general, not one-to-one. They frequently have multiple mnemonics for the same opcode, and may perform optimizations, e.g., selecting near branches versus far branches. Then there are statements like EQU that do not generate code at all. Shmuel (Seymour J.) Metz Username:Chatul (talk) 18:22, 26 June 2017 (UTC)
- OK, for the actual quote: Unlike high-level languages, there is usually a one-to-one correspondence between simple assembly statements and machine language instructions. I do agree that it could use fixing, though I am not so sure about your explanations. I read it as one line of assembler source generates one hardware machine instruction. You do have to separate what IBM calls assembler instructions (which complicates the quoted part) and others call pseudo-ops, from machine instructions. I don't read it to disallow that different assembler opcodes can generate the same machine opcode, or vice versa. Just that, as written, it is one of each. But okay, CNOP can generate more than one NOPR. But then again, it says usually. Another complication in counting is prefixes, such as in x86. They might be written on separate lines, or on the same line (an actual prefix) as the instruction they apply to, or even part of an address specifier (segment overrides). Some assemblers might fill a branch delay slot with a NOP. I suspect that there are some other cases, in other assemblers, but rare enough to satisfy usually. And, of course, there are macros which, sort of by definition, often generate more than one instruction.
- I agree that most assemblers support a few non-one-to-one operations, but the main distinction of assemblers from compilers is that assemblers focus on one-to-one operations, while compilers have little or no concept of one-to-one support. The "usually" statement emphasizes that distinction, as we should. So, "one to one" must stay. Is there some suggestion for tweaking the wording? --A D Monroe III (talk) 23:53, 26 June 2017 (UTC)
- As I suggested above, there is the complication that IBM calls the assembler source for executable instructions machine instructions, which is confusing, as the quoted statement calls the assembler output machine instructions. Statements that don't generate anything, but tell the assembler something are called assembler instructions. The quoted statement could have some way to make the distinction between such statements. Gah4 (talk) 02:38, 27 June 2017 (UTC)
- Why? Of course there are details of reality that are more complicated than a single sentence can ever completely encompass. But the quoted statement is not attempting to completely encompass everything; it's only stating the main distinction between assemblers and compilers. I see nothing brought up here that affects the stated distinction, or implies the "usually" qualification insufficient or incorrect. Again, the wording might be improved, but we need to keep its substance and simplicity. --A D Monroe III (talk) 19:49, 27 June 2017 (UTC)
- Yes. If someone happens to have better wording, it would be interesting to see. No complaints about the usually, but about the meaning of machine instruction, which can mean different things to different people. Gah4 (talk) 22:40, 27 June 2017 (UTC)
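Two of the cases raised in this thread, several mnemonics for one opcode and statements that generate no code, can be shown in a few lines. This is a deliberately simplified sketch: the `encode` function and its statement format are hypothetical, and the operand is taken directly as an 8-bit displacement rather than computed from a label. The opcode values are the standard x86 short-form encodings, where JE and JZ share 0x74 and JNE and JNZ share 0x75.

```python
SHORT_JUMP_OPCODES = {"JE": 0x74, "JZ": 0x74, "JNE": 0x75, "JNZ": 0x75}


def encode(statement, constants):
    """Return the bytes generated by one statement of the toy syntax."""
    fields = statement.split()
    # 'NAME EQU value' only records a constant; it emits no bytes at all.
    if len(fields) == 3 and fields[1].upper() == "EQU":
        constants[fields[0]] = int(fields[2], 0)
        return b""
    mnemonic, operand = fields[0].upper(), fields[1]
    displacement = constants[operand] if operand in constants else int(operand, 0)
    # Opcode byte followed by the displacement as an 8-bit two's-complement value.
    return bytes([SHORT_JUMP_OPCODES[mnemonic], displacement & 0xFF])


if __name__ == "__main__":
    constants = {}
    for statement in ("JE -7", "JZ -7", "JNE -7", "SKIP EQU 4", "JNZ SKIP"):
        print(f"{statement:12} -> {encode(statement, constants).hex() or '(no bytes)'}")
```

So the mapping from source statements to machine instructions is many-to-one for the aliases and one-to-none for EQU, which is compatible with the article's "usually" but worth keeping in mind when rewording it.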
Potential Citation/Reference Issue
I am not sure if you had intended to link to another article but currently the citation: Booth, A.D.; Britten, K.H.V. (September 1947). "Coding for the ARC" (PDF). Birkbeck College, London. Retrieved 23 July 2017.
Points to the following document:
Booth, A.D.; Britten, K.H.V. (August 1947). "General Considerations In The Design Of An All Purpose Electronic Digital Computer" (PDF). [Source/Date TBD]
I did search for another PDF to what I think you intended, but could not readily find a PDF. I did include the following link as it might be helpful in the search for the actual content: http://hopl.info/showlanguage.prx?exp=4929
Thoughts? — Preceding unsigned comment added by Illusive.D0ct0r (talk • contribs) 13:25, 7 December 2017 (UTC)
someone asked: are there programmable devices, that have assemblers, that aren't computers?
Are there programmable devices that have assemblers but aren't computers? It depends. There are things that aren't computers in the way most people think about them, but then again, you might define anything programmable as a computer. How about programmable hand (or desk) calculators, which are most often not considered computers but might do many things that one would commonly do on a computer? Assemblers are commonly used for microcoding, that is, for the internal control code of a computer. There are assemblers written specifically for finite state machines that are part of other systems, but not general enough to be computers. One of the less obvious ones is that IBM OS/360, and likely other IBM systems, use the assembler to generate JCL for doing system generation. The assembler macro facility is general enough to do things other than generate machine code. Gah4 (talk) 04:44, 14 May 2018 (UTC)
- DOS/360 and TOS/360 certainly did the same thing for stage 1 of a SysGen. Shmuel (Seymour J.) Metz Username:Chatul (talk) 13:09, 14 May 2018 (UTC)
Overly broad claim.
Assembly language has the statement "Despite the power of macro processing, it fell into disuse in many high level languages (major exceptions being C, C++ and PL/I) while remaining a perennial for assemblers." The preprocessor facilities of C and C++ are not particularly powerful, and can't even do simple computations.
I added the footnote "Of those listed, only the PL/I macro facility is Turing Complete" and user:Wtshymanski reverted it with no explanation. I'm throwing this open to discussion before I reinstate my correction. Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:08, 24 May 2018 (UTC)
- This is an article about assembly language, not macro preprocessors. It is hugely common in C, even today, and even though it's not powerful. The point here isn't whether macro languages are powerful in particular languages, it's about whether they're used or not. Andy Dingley (talk) 17:20, 24 May 2018 (UTC)
- That's a good reason to remove the references to C and C++ entirely. It's not a good reason to make misleading claims about them, or to remove clarifications of those claims. And what is the it that's hugely common in C? Certainly not the types of macros that have been common in assembler code for the last half century. Shmuel (Seymour J.) Metz Username:Chatul (talk)
- Does anyone but comp-sci profs actually care if thus-and-so is Turing complete? And more importantly, is it necessary to send the reader of this article off on a wild goose chase to learn what "Turing complete" means? --Wtshymanski (talk) 19:34, 24 May 2018 (UTC)
- Is it necessary to give the reader a misleading reference to C and C++? If you don't want the reference to those languages to be clarified, then remove them entirely, not just the clarification. Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:02, 25 May 2018 (UTC)
- I was going to say that it belongs in an article on assemblers, and not on the language that they process, but it seems that this is also, with a redirect, the article on assemblers. There are articles on specific assemblers, but not on the general idea of an assembler program. One solution is to actually split this up, with the more theoretical parts here, and more practical parts, such as the need for multiple passes, in another article. Since macros are an integral part of many assemblers, discussing them doesn't seem so far off. Gah4 (talk) 20:37, 24 May 2018 (UTC)
- Assembly language is the article on the general idea of an assembly program; the presence of examples doesn't change that. If you know of something that makes a claim specific to a particular assembler, please correct it or point it out here. BTW, do you know of an inappropriate redirect to Assembly language? Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:02, 25 May 2018 (UTC)
- I was thinking along the lines of the difference between IBM's Language Reference and Programmer's Guide, or more generally, between theory and practice. As an example, which may not actually apply here, it is usual to use a hash table in an assembler to keep track of symbols. That is an implementation detail unrelated to the actual language. More obviously, as I noted above, the use of multiple passes. I do believe that it is possible to separate the language from the programs that process it, and still independent of any specific assembler. Gah4 (talk) 18:28, 25 May 2018 (UTC)
- Does wiki currently have articles on compiler implementations[a]? If so, one on assembler implementations might be a good idea if someone is willing to do the work. Such an article could go into the tradeoffs among, e.g., a hash table, a B-tree, sorting the symbol table. Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:28, 27 May 2018 (UTC)
- ^ Programming language implementation is not that article
To add a paragraph on the origin (the first machine an assembler was made for, date and authors), the reason it was named assembler, etc.
I want to add a short description of the origins of assembler: (a) the first digital computer an assembler was written for, and what it (the assembler) was like; (b) its author; (c) why it is named assembler/assembly language, by whom and when; (d) whether it originally had another name.
Unfortunately, the sources I found are somewhat contradictory. A number of different views can be found on Quora (https://www.quora.com/Why-is-the-assembly-language-called-so). In particular, it has the following statements: 1. The inventor actually called what we now call an assembler (which converts the mnemonics of Assembly into machine code) a “converter.” 2. The programmer would convert each symbolic instruction to its binary equivalent, which became known as “assembling” the program. It wasn’t long before someone wrote a program to do the job, and naturally named it the “assembler.” In a backward way, the symbolic instructions became known as “assembler” or “assembly” code. The most reliable source I found is the IEEE Computer Society article on David Wheeler's 1985 Computer Pioneer Award "For assembly language programming" (https://www.computer.org/web/awards/pioneer-david-wheeler): Wheeler's "initial orders" allowed Edsac instructions to be provided in a simple language rather than by writing binary numbers, and made it possible for non-specialists to begin to write programs. This was the first "assembly language" and was the direct precursor of every modern programming language, all of which derive from the desire to let programmers write instructions in a legible form that can then be translated into computer-readable binary. --P.maistrenko (talk) 14:26, 4 October 2018 (UTC)
"First appeared 1949" -- did it?
The infobox claims that "assembly language" first appeared in 1949, and the article also has a number of categories relating to that year. However, I can't find any text or citations in the article justifying that date, and in fact the section on "Historical perspective" claims that "The first assembly language was developed in 1947 by Kathleen Booth for the ARC2". Seems like an error.
- I suspect that, as usual, the exact date depends on the exact definition. One important property of assemblers now is symbolic addressing. It might be that early assemblers (random guess) allowed symbolic opcodes, but not symbolic addressing. You then have to decide if that counts or not. I suspect it took some years to get to the modern definition. Gah4 (talk) 20:17, 20 April 2020 (UTC)
Kathleen B, 1947, Assembly language
The second paragraph of the paper written by the Booths begins:
- "The non-original ideas, contained in the following text, have been derived from a number of sources, ... It is felt, however, that acknowledgement should be made to Prof. John von Neumann and to Dr. Herman Goldstein for many fruitful discussions ..."
Kathleen Booth's 1947 contribution to the field began with a 1946 trip by her future husband, Andrew Booth, to the USA, spending time at Princeton, gaining insight into the field from Prof. von Neumann. The following year they both came, for a longer (6 month) visit.
Their 1947 paper envisioned parallel arithmetic units and/or a large memory. Each parallel unit would possibly do 100 calculations per second, and a large memory would be 1,000 to 10,000 "numbers" of "approximately 40 binary digits." After talking theoretics, they labeled this "quite impracticable."
Over a decade later the word sizes of 36 bits (e.g. IBM 7094) were the high end, and while 32K did exist by that time, earlier machines were typically in the 4K range.
In short, since programming in those days meant flipping switches, Kathleen Booth's work was not the writing of an assembler, with or without a symbol table. It was really about not having to flip switches over and over, but rather recording binary values on paper tape.
Some one person, or several people, together or at similar times but isolated from one another, developed assembly language. The "is credited" wording seems short and to the point. The Von Neumann reference isn't really necessary - it would require splitting all of this into sections to cover
- Andrew Booth's 1946 trip
- The six month follow up trip by both Kathleen and Andrew
- Kathleen's flipping of switches - early day "programming"
- Their review of possible memory types, including paper tape, magnetic tape, magnetic drum
- Her theoretical work, which didn't produce an actual assembler program, since symbolic characters were not even part of the system - there were no character strings.
It should be understood that even Project Whirlwind didn't meet all of the goals envisioned by the Booths. To recap, even "is credited" may be an overstatement, but to say that she actually wrote an assembler, on a machine that didn't deal in character data, is untrue. Pi314m (talk) 07:34, 10 February 2019 (UTC)
Misplaced text
The last three paragraphs of Assembly language#Assembly directives have nothing to do with Assembly directives:
Symbolic assemblers let programmers associate arbitrary names (labels or symbols) with memory locations and various constants. Usually, every constant and variable is given a name so instructions can reference those locations by name, thus promoting self-documenting code. In executable code, the name of each subroutine is associated with its entry point, so any calls to a subroutine can use its name. Inside subroutines, GOTO destinations are given labels. Some assemblers support local symbols which are lexically distinct from normal symbols (e.g., the use of "10$" as a GOTO destination).
Some assemblers, such as NASM, provide flexible symbol management, letting programmers manage different namespaces, automatically calculate offsets within data structures, and assign labels that refer to literal values or the result of simple computations performed by the assembler. Labels can also be used to initialize constants and variables with relocatable addresses.
Assembly languages, like most other computer languages, allow comments to be added to program source code that will be ignored during assembly. Judicious commenting is essential in assembly language programs, as the meaning and purpose of a sequence of binary machine instructions can be difficult to determine. The "raw" (uncommented) assembly language generated by compilers or disassemblers is quite difficult to read when changes must be made.
I'd probably move the text to new subsections of Assembly language#Key concepts or Assembly language#Language design.
Also, some compilers generate assembly language with comments or pseudo-assembly listings with comments, e.g., many of IBM's PL/I compilers. Shmuel (Seymour J.) Metz Username:Chatul (talk) 18:10, 14 April 2019 (UTC)
one pass
[edit]There is recent discussion in edit summaries on one-pass assembly. Many (most? all?) assemblers allow forward references, such that they might not be able to completely assemble something on the first pass. It is, then, usual for a first pass that determines the address of each item (instruction or data), and then on the second pass, knowing all addresses, generate actual output. However, the object format used for OS/360, and possibly for others, has an address on each output card (that is, 80 byte record), such that output does not have to be in sequential address order. That makes it easier to write a one-pass assembler. On the other hand, you can write things where the length is determined by a symbolic name. I am not sure what they do about that. Note, though, that the out of order object code just moves the problem to the linker. The OS/360 linkage editor is famous for being slower than compilers. (An original design goal, and the reason for the name linkage editor, is to reduce the need for complete recompilation.) A multi-pass assembler that reads the input card deck more than once is pretty inconvenient. Later, they just use temporary disk files. Early assemblers had to run in small memory, though so did the linkers. Gah4 (talk) 21:05, 20 April 2020 (UTC)
- A couple of notes:
- On the IBM 650 the Symbolic Optimal Assembly Program (SOAP) assigned addresses to symbols at the time of first use in a fashion intended to reduce rotational delay.
- On the S/360, everything except BPS/360 required tape or disk; the compilers had no need to read the physical cards once per pass. In particular, on OS/360 both Assembler (E) and Assembler (F) used work files. There was no single pass assembler for DOS or OS. Shmuel (Seymour J.) Metz Username:Chatul (talk) 00:49, 21 April 2020 (UTC)
- I don't know if any are still around, but there was SPASM, a Single Pass assembler for OS/360, referenced here. I don't remember ever using it, though. Gah4 (talk) 01:12, 21 April 2020 (UTC)
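To illustrate the point about address-tagged object records, here is a minimal sketch of a one-pass assembler in Python. The three-instruction machine, the `OPCODES` table and the record layout are invented for illustration and do not correspond to the OS/360 object format or to SPASM. Each output record carries its own address, so a forward reference is handled by emitting the word later, out of address order, once the label is seen. The sketch does not address the harder case mentioned above, where the length of an item depends on a symbol.

```python
# Hypothetical one-word instructions: opcode in the high bits, address in the low 12.
OPCODES = {"LOAD": 0x1000, "JUMP": 0x2000, "HALT": 0x3000}

def assemble_one_pass(lines):
    symbols = {}   # label -> address, filled in as labels are encountered
    pending = {}   # label -> list of (address, opcode) waiting for that label
    records = []   # (address, word) pairs, not necessarily in address order
    address = 0
    for line in lines:
        fields = line.split()
        if fields and fields[0].endswith(":"):
            label = fields.pop(0)[:-1]
            symbols[label] = address
            # The label is now known: emit the words that were waiting for it.
            for where, opcode in pending.pop(label, []):
                records.append((where, opcode | address))
        if not fields:
            continue
        opcode = OPCODES[fields[0]]
        operand = fields[1] if len(fields) > 1 else None
        if operand is None:
            records.append((address, opcode))
        elif operand in symbols:
            records.append((address, opcode | symbols[operand]))
        else:
            # Forward reference: remember it and emit nothing yet.
            pending.setdefault(operand, []).append((address, opcode))
        address += 1
    if pending:
        raise ValueError("undefined symbols: " + ", ".join(sorted(pending)))
    return records

def load(records):
    # The loader (or linker) has to cope with records arriving out of order.
    memory = {}
    for address, word in records:
        memory[address] = word
    return [memory[a] for a in sorted(memory)]

source = ["start: LOAD value", "JUMP done", "value: HALT", "done: HALT"]
records = assemble_one_pass(source)
image = load(records)
```

Note that the sorting work has only moved into `load`, which is the "moves the problem to the linker" observation in miniature.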
two-pass assemblers
[edit]Some editorial comment indicated that it was unclear that the listing and code-generation would occur during the second pass. That's expected, since forward-references to symbols meant that the required values wouldn't be available until the whole source was read/parsed. And yes, reading a card deck or paper tape more than once was a little inconvenient TEDickey (talk) 23:16, 20 April 2020 (UTC)
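A rough Python sketch of why listing and code generation fall in the second pass: pass 1 only assigns addresses to labels, and pass 2, with the whole symbol table available, produces the words and the listing in order. The instruction set and the `OPCODES` table are hypothetical, chosen only to keep the example short.

```python
# Hypothetical one-word instructions: opcode in the high bits, address in the low 12.
OPCODES = {"LOAD": 0x1000, "JUMP": 0x2000, "HALT": 0x3000}

def split_label(line):
    """Return (label or None, remaining fields) for one source line."""
    fields = line.split()
    label = None
    if fields and fields[0].endswith(":"):
        label = fields.pop(0)[:-1]
    return label, fields

def assemble_two_pass(lines):
    # Pass 1: assign an address to every label; generate no output.
    symbols = {}
    address = 0
    for line in lines:
        label, fields = split_label(line)
        if label is not None:
            symbols[label] = address
        if fields:
            address += 1
    # Pass 2: every symbol is known, so code and listing come out in order.
    words, listing = [], []
    address = 0
    for line in lines:
        _label, fields = split_label(line)
        if not fields:
            continue
        word = OPCODES[fields[0]]
        if len(fields) > 1:
            word |= symbols[fields[1]]   # forward references resolve here
        words.append(word)
        listing.append(f"{address:04X} {word:04X}  {line}")
        address += 1
    return words, listing

source = ["start: LOAD value", "JUMP done", "value: HALT", "done: HALT"]
words, listing = assemble_two_pass(source)
```

The source is read twice, which is exactly the inconvenience with card decks or paper tape noted above.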
Historical perspective - Atari St - Commodore Amiga
[edit]C was more popular than assembly for these 68000 based home computer systems. Rcgldr (talk) 14:34, 11 September 2020 (UTC)
- Most games and demos were written in Assembly but strategy/RPG/adventure games (ported from PC & Mac) and OSs and utilities were written in C. So it's not easy to say which language was more popular; both were very important. Hobbyists also used a lot of BASIC (and STOS/AMOS) --84.248.217.94 (talk) 10:22, 3 August 2021 (UTC)
Current usage - IBM mainframes
[edit]There is a lot of IBM 360 family legacy code that is a mix of Cobol and assembly. The historical reason is that the database access methods, such as ISAM (Indexed Sequential Access Method) (it might have been BDAM?) were implemented as assembly based macros, and as long as some assembly was needed, some optimized code was also implemented in assembly. For current usage, few companies would be willing to take the risk or time it would take to port huge libraries of working assembly code to higher level languages. Rcgldr (talk) 14:38, 11 September 2020 (UTC)
- COBOL compilers in both DOS and OS supported ISAM. Shmuel (Seymour J.) Metz Username:Chatul (talk) 19:26, 11 September 2020 (UTC)
- As posted below, maybe it was BDAM or some feature for ISAM. The key point is that there is still a legacy mix of Cobol and assembler for IBM mainframes. [IBM Cobol and assembly] Rcgldr (talk) 22:22, 11 September 2020 (UTC)
- I suspect that ISAM was designed for COBOL. I believe PL/I supports it, but then it seems to be designed to support many COBOL features. It might still be that some called assembly routines to do I/O, or other operations. Gah4 (talk) 19:40, 11 September 2020 (UTC)
Maybe BDAM. I used ISAM in PL/I and COBOL beginning in 1970 and never needed assembler for any of it. I don’t think HLLs supported all DAM options. I finally realized you could process all members of a BPAM dataset in PL/I (and presumably COBOL) with only an assembler routine to put the member names in the JFCB. Peter Flass (talk) 21:20, 11 September 2020 (UTC)
- I suppose that there might be some feature of ISAM that wasn't available in COBOL, and might need a special routine. This article is, at least somewhat, supposed to be system independent. I do remember with the PDP-10/Fortran-10, which does know about direct access files, but I also needed record locking. Someone wrote two Macro-10 programs, just a few instructions each, which did that. That is, more than one copy of the program could run, and access the file, at the same time. I suspect that there are enough times when just a little assembly program can make things easier, but also complicate porting when needed later. My favorite for IA32 is a two instruction program to RDTSC (read time stamp counter) and return. Nice for accurate timing to find bottlenecks. So, maybe the article can say something general about the use of small assembly programs called from high-level languages. In all my years of assembly programming, just about all is routines called from high-level languages. Can we write something general about that? Gah4 (talk) 22:09, 11 September 2020 (UTC)
- "system independent" - the issue here is current usage of assembler by mainframes is mostly due to IBM mainframe legacy code, still very popular, as it is used by banks, insurance companies, government, ... .Rcgldr (talk) 01:45, 12 September 2020 (UTC)
@Chatul: @Peter Flass: Assembly macros are still in use. IBM's migration to the current version of Cobol includes the changes needed for the assembly code, but doesn't mention porting that assembly code into Cobol, so apparently some aspects of the access methods still require assembly macros: DFSMS macro instructions for data sets pdf Rcgldr (talk) 07:54, 12 September 2020 (UTC)
every assembly language is designed for exactly one specific computer architecture
[edit]The article says: every assembly language is designed for exactly one specific computer architecture. While this should mostly be true, it doesn't seem quite so obvious. For one, when an architecture is extended, consider S/360, S/370, XA/370, ESA/370, ESA/390, z/, the new assembler is usually backwards compatible. (As long as you don't use new instructions.) Also, pretty often the first try for new instructions is done with macros in the old assembler. That won't work for new addressing modes, though. But also it depends on what you mean by assembly language. If it means other than the specific machine instructions, then some assemblers for 8 bit microprocessors could be used for more than one. A look-up table for the machine instructions was used, where the assembler instructions (see discussion above) were the same. Then there is GNU gas, which is multi-architecture, though usually not using the syntax of the one designed for each architecture, and often different opcode mnemonics. Gah4 (talk) 04:19, 12 September 2020 (UTC)
- Half a century ago there were assemblers from UNIVAC and SDS targeted to multiple architectures, long before gas. Shmuel (Seymour J.) Metz Username:Chatul (talk) 19:44, 13 September 2020 (UTC)
- The article may say that but it isn't the case. An assembly language is designed to be used with a specific processor, or processor family, not with a specific machine.
- 216.152.18.132 (talk) 02:06, 3 August 2021 (UTC)
- Except when it isn't; some assemblers are table driven and can handle multiple architectures. Meta-Symbol[1][2] and Meta-Assembler[3] (MASM) go back to the 1960s, and more recently there is gas. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 09:41, 3 August 2021 (UTC)
References
- ^ SYMBOL and META-SYMBOL Reference Manual for 900 Series/9300 Computers (PDF). Scientific Data Systems. March 1969. 90 05 06G. Retrieved August 3, 2021.
- ^ Xerox Meta-Symbol Sigma 5-9 Computers Language and Operations Reference Manual (PDF). Xerox Data Systems. Oct 1975. 90 09 52G. Retrieved August 3, 2021.
- ^ Sperry Univac Computer 1100 Series Meta-Assembler (MASM) Programmer Reference (PDF). Revision 1. Sperry Univac Computer Systems. 1977. UP-8453. Retrieved August 3, 2021.
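As a rough illustration of what "table driven" means in this thread, the Python sketch below keeps the assembler logic fixed and selects a different opcode table per target. The two "architectures", their mnemonics and their encodings are entirely invented; nothing here reflects how Meta-Symbol, MASM or gas actually work.

```python
# Two hypothetical opcode tables sharing mnemonics but not encodings.
TABLES = {
    "machine_a": {"LDA": 0x10, "STA": 0x20, "JMP": 0x30},
    "machine_b": {"LDA": 0x81, "STA": 0x82, "JMP": 0x83},
}

def assemble(lines, architecture):
    """Assemble 'MNEMONIC operand' lines into opcode/operand byte pairs."""
    table = TABLES[architecture]      # the only architecture-specific part
    output = []
    for line in lines:
        mnemonic, operand = line.split()
        output.append(table[mnemonic])
        output.append(int(operand, 0) & 0xFF)   # accepts 5 or 0x10
    return bytes(output)

program = ["LDA 5", "JMP 0x10"]
code_a = assemble(program, "machine_a")
code_b = assemble(program, "machine_b")
```

This only covers the easy case where the targets differ in opcode values; differing instruction formats or addressing modes need more than a lookup table.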
GA Review
- This review is transcluded from Talk:Assembly language/GA1. The edit link for this section can be used to add comments to the review.
Reviewer: Wasted Time R (talk · contribs) 13:20, 17 September 2020 (UTC)
This looks to have been a drive-by nomination made by an erratic editor who has since been indef-blocked for incompetence. The article has large swaths of unsourced material, not just explanatory material but historical and analytical as well. In some cases there are whole sections without any citations. So this has to be a fail.
But the article is not bad at all. Content-wise, my main suggestion for improvement is that the use of assembly language for IBM mainframes needs to be given more attention. It is mentioned here and there, but back in the heyday of the IBM 360/370, when it was the dominant computing platform in the industry, assembly language was everywhere, not just for high-performance system software components but for run-of-the-mill business applications as well. Learning 360/370 Assembly was part of the standard education that commercial programmers had to get and there were a lot of textbooks and commercial courses available for it. For instance, in the textbook Kevin McQuillen, System/360–370 Assembler Language (OS) (Mike Murach & Associates, 1975), the example programs that the text develops concern a batch inventory control and reorder application, and later parts of a batch payroll application are constructed. This whole aspect of historical assembly language use is counter-intuitive to today's reader and part of the value that this article can bring is to describe it. Wasted Time R (talk) 13:20, 17 September 2020 (UTC)
Other Platforms
[edit]I would suggest adding more information on the use of assembly language on platforms from other vendors, not just on other IBM platforms. I know for a fact that CDC and RCA used assemblers as implementation languages on their operating systems, and I'm confident that many others did as well. Similarly, there was a lot of customer use of assembly languages on non-IBM platforms. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:21, 10 September 2021 (UTC)
Macro facilities in open code
[edit]@Wtshymanski: In many assemblers, pseudo-ops used inside of macro definitions can also be used in open code, and the article does not discuss this. As a start, I added the text below, which user:Wtshymanski reverted:
In addition, some of the assembler statements useful in macro definitions are also valid in open code, e.g., the HLASM statements
- AGO: Transfer to specified assembler statement
- AIF: Evaluate logical and transfer if true
- GBLx: Define compile-time variables in a global context
- LCLx: Define compile-time variables in a local context
- SETx: Evaluate expressions and assign their values to compile time variables
There is a lot of code that uses these facilities outside of macro definitions, and I believe that the existing text on assembly language macros is misleading without a discussion of the use of them in open code. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:40, 10 September 2021 (UTC)
Tiobe mixing assembly language with WebAssembly ?
[edit]This article cites Tiobe reporting around 2.5% usage of assembly language. WebAssembly is not listed at all amongst 100 languages. Given the expectation that the latter is used more than the former, Tiobe may have mixed up these two. The Wikipedia article on the latter confusingly refers to the former. — Preceding unsigned comment added by Jgeer (talk • contribs) 23:27, 7 November 2021 (UTC)
Non sequitur
[edit]GliderMaven merged two sentences to read "Because assembly depends on the machine code instructions, each assembly language is specific to a particular computer architecture and sometimes to an operating system."
However, the reason that FORTRAN Assembly Program (FAP) on the FORTRAN Monitor System differs from Macro Assembly Program on IBSYS/IBJOB and Assembler D on DOS/360 differs from Assembler F on OS/360 has nothing to do with dependency on the machine code, since the machine code is identical. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:03, 11 November 2021 (UTC)
- FORTRAN Monitor System ran on the 709, 7090, and 7094, as did IBSYS/IBJOB. Assembler D and F ran on System/360. Machine code for 709, 7090, and 7094 is different from System/360, because they are different computer architectures. It's reasonable to say that if there are two different sets of machine instructions, two different assemblers will be required. (Although it's possible to create one assembler program that uses nearly the same mnemonics for two or more architectures, and produces different machine code as commanded by some sort of mode setting.) Jc3s5h (talk) 18:02, 11 November 2021 (UTC)
- Yes, "It's reasonable to say that if there are two different sets of machine instructions, two different assemblers will be required", but in this case it's four assemblers for two architectures. I wasn't citing four distinct assemblers for a single architecture, but rather two distinct assemblers for each of the architectures.
- I believe that some microprocessors have a lot more than two distinct assemblers.
- There are assemblers that let you specify a different opcode table in order to support multiple architectures. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 21:05, 11 November 2021 (UTC)
- So let's get this straight, you think that stating that each assembler is specific to a machine code and sometimes an operating system, you think that it means that each machine code and operating system has only one assembler???? GliderMaven (talk) 04:44, 12 November 2021 (UTC)
- Um. No. That's very much like saying that all lions are cats, so therefore all cats are lions. It's faulty logic. One does not imply the other. At all. GliderMaven (talk) 04:44, 12 November 2021 (UTC)
- You are totally misrepresenting what I wrote and what I think. The issue is not the word specific; the issue is the word because. The text from because to the comma is incorrect for assemblers specific to an operating system. That is the obvious reason that there were originally two sentences. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:49, 12 November 2021 (UTC)
- As usual, it is complicated. Some of the OS difference is actually different macro sets that could be used with the same assembler. I suspect that the assemblers and macros for Linux/390 are very different from OS/390, even for the same hardware. One of the fun things you can do with macros is to make an assembler for a completely different machine. This was usual in the early microcomputer days, where macros for OS/360 assemblers would generate 8080 or 6502 code. There was a program to reformat the object program into the usual form. Gah4 (talk) 22:23, 12 November 2021 (UTC)
- Macro definitions are typically in separate libraries. The differences between FAP and MAP, or among Assembler D, Assembler F and Assembler XF include differences in the pseudo-ops. The differences among microprocessor assemblers include the order of operands in machine instructions. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 18:04, 14 November 2021 (UTC)
- Well, for example, OS/360 and DOS/360 use completely different macros, not only the lower levels. I believe that there is Assembler D for OS and DOS, but don't know if they share code. Linux/390 uses ASCII, though some assemblers for EBCDIC systems will accept ASCII source. And then there are cross assemblers. There are just so many different things that have been tried, that it is hard to say more. Gah4 (talk) 05:15, 15 November 2021 (UTC)
Yes, DOS/360 and OS/360 have different macro libraries, but they are not part of the assemblers. Assembler D is DOS only, Assemblers E and F are in CP-67, OS/360 and TSS/360, and Assembler XF is in at least DOS/VSE, OS/VS1, OS/VS2 and VM/370. I believe that Assemblers E and F share code. All of which confirms that the because is incorrect. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:08, 15 November 2021 (UTC)
Macro pseudo-ops in open code
[edit]@Wtshymanski: In several assemblers, pseudoops meant for defining macros can also be used in open code.
I added the text "
In addition, some of the assembler statements useful in macro definitions are also valid in open code, e.g., the HLASM statements
- AGO: Transfer to specified assembler statement
- AIF: Evaluate logical and transfer if true
- GBLx: Define compile-time variables in a global context
- LCLx: Define compile-time variables in a local context
- SETx: Evaluate expressions and assign their values to compile time variables
" to Assembly language#Macros and Wtshymanski reverted the change, stating that it was out of place. I don't see anything wrong with either the text or its location. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 00:18, 19 November 2021 (UTC)
- @Wtshymanski: The list of pseudo-ops may be TMI, but surely the fact that they are allowed in open code belongs there. How about just "In addition, some of the assembler statements useful in macro definitions are also valid in open code"? --Shmuel (Seymour J.) Metz Username:Chatul (talk) 01:42, 23 November 2021 (UTC)
- This article doesn't define "open code" so the phrase is meaningless to the reader. This reader, anyway. --Wtshymanski (talk) 03:22, 24 November 2021 (UTC)
- @Wtshymanski: Surely it should, since some pseudo-ops are invalid in open code, e.g., MEXIT in HLASM. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 11:46, 24 November 2021 (UTC)
- When it comes to defining what features work in which implementations, we leave the realm of an encyclopedia article and descend to the level of a textbook...or a programmer's manual or how-to guide. The first dozen hits on Google Books for "open code" are split between "open source" and food best-before dates written in plain language instead of a cipher. --Wtshymanski (talk) 19:39, 24 November 2021 (UTC)
- @Wtshymanski: "open code" = "outside of macro definitions". It's a pretty well understood term among assembler programmers. Peter Flass (talk) 20:09, 24 November 2021 (UTC)
- You’re right that google shows a lot of irrelevant results in a generic search for “open code”, but it’s not just an IBM-ism, but is regularly used when talking about macros and conditional assembly. For example, here’s one result from a book on MASM programming [5]. If the term is used here, however, it should probably be defined. Peter Flass (talk) 02:27, 25 November 2021 (UTC)
It is useful to include features which exist in assemblers for a variety of machines. In this case, the actual features are assembler variables and conditional assembly. That is, the assembler equivalent of C's #define and #ifdef. (More generally, #if.) AGO allows for loops, which might be more rare for assemblers. These are similar to the features of the PL/I preprocessors, and as well as I know, implemented in a preprocessor stage by assemblers. (That is, a temporary file is written for later processing.) The more general case should be covered here. Gah4 (talk) 23:41, 23 June 2024 (UTC)
- While many simple assemblers have a separate preprocessor stage, that is by no means universal. Specifically, the IBM assemblers Assembler H Versions 1 and 2 and High Level Assembler (HLASM) allow macros to query attributes of symbols even when they are defined later in the source code than the macro definition and invocation. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:18, 24 June 2024 (UTC)
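For readers unfamiliar with assembler variables and conditional assembly, the following Python sketch interprets a toy source in which `SET`, `AIF` and `AGO` statements appear in open code and drive an assembly-time loop. The statement syntax here is invented for the example and is not HLASM syntax; only the conventions of marking jump targets with a leading period and variables with `&` are borrowed. It is also written as a separate preprocessing step, which, as noted above, is not how every assembler does it.

```python
def evaluate(expression, variables):
    # Toy expression evaluator; eval with no builtins is acceptable only
    # because this is a sketch run on trusted input.
    return eval(expression, {"__builtins__": {}}, dict(variables))

def expand(lines, max_steps=1000):
    """Run assembly-time statements and return the generated source lines."""
    labels = {line.strip(): i for i, line in enumerate(lines)
              if line.strip().startswith(".")}
    variables, output = {}, []
    pc = steps = 0
    while pc < len(lines):
        steps += 1
        if steps > max_steps:
            raise RuntimeError("assembly-time loop did not terminate")
        line = lines[pc]
        fields = line.split(None, 2)
        pc += 1
        if not fields or fields[0].startswith("."):
            continue                                  # blank line or jump target
        if fields[0] == "SET":                        # SET name expression
            variables[fields[1]] = evaluate(fields[2], variables)
        elif fields[0] == "AGO":                      # AGO .target
            pc = labels[fields[1]]
        elif fields[0] == "AIF":                      # AIF .target condition
            if evaluate(fields[2], variables):
                pc = labels[fields[1]]
        else:                                         # ordinary statement
            for name, value in variables.items():
                line = line.replace("&" + name, str(value))
            output.append(line.strip())
    return output

source = [
    "SET n 0",
    ".loop",
    "AIF .done n >= 3",
    "WORD &n",
    "SET n n + 1",
    "AGO .loop",
    ".done",
    "END",
]
generated = expand(source)
```

The backward `AGO` is what turns conditional assembly into a loop, which is the part with no direct counterpart in C's `#if`.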
Consistency of English articles
[edit]There is a general rule in English (from various grammar books) that the usage like "The English language" requires the definite article.
As I understand from this article, there is a family of "assembly languages" (which would require one of them to be "an assembly language", and so it is written in Simple English Wikipedia) and one "assembler language" ("the assembler language", cf. IBM). Some sources also capitalize this word.
Unfortunately I'm not a native speaker, so I'm not sure what is correct (and I suspect that the contributors to this article are not exclusively native speakers). Probably all variants are, but maybe there should be a single variant across this article or Wikipedia? Maybe even add a section on its spelling and article (maybe to Wiktionary)?
Yaroslav Nikitenko (talk) 16:02, 13 February 2022 (UTC)
- I am a native English speaker, and an electronics engineer who has written short programs in assembly languages for IBM computers and other processors, such as Intel. I have also designed integrated circuits that physically execute the associated machine instructions. I have never noticed any consistent distinction between "assembly language" and "assembler language". Jc3s5h (talk) 16:34, 13 February 2022 (UTC)
- There is no "the assembler language of IBM"; IBM has provided many assemblers for many different machines and with wildly varying syntax. The link that you gave is for a specific assembler, HLASM, and it does not look remotely like assemblers for other product lines, e.g., 7070 Autocoder. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:53, 14 February 2022 (UTC)
- I suspect I have been wondering about this, almost as long as I knew about assembly language. And pretty much, I still don't know. Mostly I remember hearing assembly language when spoken, and maybe half and half when written. One of those many cases where the English language doesn't do what you think it should. Gah4 (talk) 02:46, 30 November 2023 (UTC)
- In my four decades of programming and IC-design experience, I've always heard it called "assembly language." Digital27 (talk) 03:21, 30 November 2023 (UTC)
- I've been programming since 1960, and have heard a variety[a] of terms, including assembler language and assembly language. The key point is that neither term specifies a particular language, but rather a diverse family of languages. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:56, 30 November 2023 (UTC)
Notes
- ^ Within a single shop the usage tended to be consistent except when there were multiple machines installed.
Assembly Language Primer For Hackers
[edit]I don't know if there is a good place to include this as a reference, but there is a short video series that is very good at explaining how assembly language works called Assembly Primer For Hackers. Hopefully there is a place to use this as a reference. Maybe in external links? -- Ubh [talk... contribs...] 05:51, 18 November 2023 (UTC)
- I watched the first few minutes of the introduction and it appears to be tailored to assemblers on the Intel 32-bit x86 rather than assemblers in general. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 18:59, 29 November 2023 (UTC)
Support for truncated-address architecture
[edit]Some architectures, e.g., IBM System/360,[1] UNIVAC III,[2] have truncated addressing; an instruction does not have enough room for a full address, only an offset against a specified register. Assemblers[3][4][5][a] for those architectures typically have special features to assist in dealing with addressing, e.g., DSECT and USING for S/360 through IBM z/Architecture. I believe that the article should discuss the issue and give some illustrative examples. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 21:13, 20 June 2024 (UTC)
- Truncated addressing seems like it goes to address mode, but yes, the assembler features needed could go here. Many RISC processors seem to have 32 bit instructions, and 32 bit addresses, which requires some way to encode an address in two instructions. That, then, usually means some assembler feature to generate such code. I would guess that a feature like DSECT isn't so unusual, but USING might be rare. Gah4 (talk) 00:26, 21 June 2024 (UTC)
- Many more recent processors use PC relative addressing, which avoids the need for USING for code addresses. Addressing DSECT is a different question, and I don't know enough different assemblers to know. And I think PC relative addressing should have its own page. Gah4 (talk) 04:48, 21 June 2024 (UTC)
- Describing PC-relative addressing and describing assembler support for it are two separate issues, and I believe that there is a case for both. Note that processors as far back as Atlas, GE 635 and DEC PDP-11 allowed indexing by the program counter, long before RISC architectures, e.g., IBM 801 , Power ISA. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 10:59, 21 June 2024 (UTC)
- Note also that not all PC-relative addressing is general; some architectures support only PC-relative branching. SPARC, for example, has PC-relative branching with 22-bit displacements, plus a procedure-call instruction with a 30-bit displacement (if the upper 2 bits of the instruction are 01, it's a CALL instruction). Load and store instructions have either 2 register operands, or a register operand and a 13-bit offset, that are added together to generate the memory address. There's no equivalent of USING in SPARC assembler languages, but relatively little assembler-language programming is done for SPARC, unlike System/3x0. (And, for an OS that runs on S/390 and z/Architecture, related to the OSes that run on SPARC, the assembler doesn't, as far as I know, offer any equivalent to USING, as there's not much assembler-language programming done there, either - about 28 files with 4035 lines in the kernel and 91 files with 11160 lines in GNU libc.) — Preceding unsigned comment added by Guy Harris (talk • contribs) 20:13, 21 June 2024 (UTC)
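The SPARC CALL format described in this comment can be sketched in a few lines of Python. The function names are made up for the example; the encoding assumed is the 32-bit one, with the top two bits 01 and a 30-bit displacement counted in 4-byte words relative to the address of the CALL itself.

```python
def encode_call(pc, target):
    """Encode a CALL at address pc that transfers to target."""
    offset = target - pc
    if offset % 4:
        raise ValueError("target must be a multiple of 4 bytes away from pc")
    disp30 = (offset >> 2) & 0x3FFFFFFF     # word displacement, two's complement
    return (0b01 << 30) | disp30

def is_call(word):
    """True if the upper two bits of the instruction word are 01."""
    return (word >> 30) == 0b01

def call_target(pc, word):
    """Recover the destination address from an encoded CALL."""
    disp30 = word & 0x3FFFFFFF
    if disp30 & (1 << 29):                  # sign-extend the 30-bit field
        disp30 -= 1 << 30
    return (pc + 4 * disp30) & 0xFFFFFFFF

forward = encode_call(0x1000, 0x2000)
backward = encode_call(0x2000, 0x1000)
```

Because the displacement is in words, 30 bits are enough to reach any word-aligned address in a 32-bit address space, so no base register or USING-like bookkeeping is needed for calls.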
- Modern processors discourage mixing of code and data. Especially those with separate code and data cache, where it causes many problems. (That is, slow execution.) I am not, then, surprised that they don't supply PC relative data references. A common use of USING, along with DSECT, is named references to structure members. I suspect some assemblers have a way to do that. (That is, what would be part of a C struct.) USING used to be used for data references in the code, but, as above, that is now discouraged. Gah4 (talk) 21:46, 21 June 2024 (UTC)
"Modern processors discourage mixing of code and data."
Heck, the PDP-11 at least didn't encourage it, especially with separate I and D space. Unix programs tended to be built with separate code and data segments, and the code segment was often mapped shared and read-only; some DEC operating systems may have done the same. PC-relative data references were, I think, mostly auto-increment; that's how immediate operands were implemented.
"...named references to structure members. I suspect some assemblers have a way to do that."
Named references to structure members is independent of the size of displacements in instructions. 4BSD, at least, had a C program, genassym, which included a bunch of system headers and generated a file with a bunch of #defines for the offsets of various structure members, as part of the kernel generation and build process. The resulting assym.s file would be #included by assembler-language files that needed to refer to those structures. UN*X assemblers tended not to provide full-blown macro facilities or other assists such as DSECTs, as it was not expected that assembly language would be used except in rare cases where either 1) you needed to use specialized instructions to control the machine or 2) hand-coded some low-level routines such as memory copying, string manipulation, and language support such as larger-than-word-size integer arithmetic (e.g., 32-bit integers on a 16-bit platform or 64-bit integers on a 32-bit platform). They sometimes supported using the C preprocessor as a primitive macro facility, but that's about it. Guy Harris (talk) 22:23, 21 June 2024 (UTC)
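The genassym idea can be imitated in Python with `ctypes`, which lays structures out with the platform's C alignment rules and exposes each field's byte offset. This is only an analogy to show the technique: the `Proc` structure and its field names are hypothetical, not taken from any 4BSD header, and the real genassym was a C program.

```python
import ctypes

class Proc(ctypes.Structure):
    # Hypothetical structure standing in for a kernel data structure.
    _fields_ = [
        ("p_pid", ctypes.c_int32),
        ("p_flag", ctypes.c_int32),
        ("p_pri", ctypes.c_int16),
        ("p_stat", ctypes.c_int16),
    ]

def offset_defines(struct):
    """Return one '#define NAME offset' line per member of a ctypes structure."""
    return [f"#define {name.upper()} {getattr(struct, name).offset}"
            for name, _ctype in struct._fields_]

defines = offset_defines(Proc)
print("\n".join(defines))
```

Assembler source that includes the generated lines can then refer to a member by name as a displacement from whatever register holds the structure's address, which gives much of what a DSECT provides without any assembler support.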
- Actually, HLASM runs on zLinux, although probably not Solaris; it's a gas (gd&r).
- Somewhat perversely, z/Architecture has a 16-bit relative and a 32-bit relative long; IMHO a 64 KiB single code section is much too large, to say nothing of 4 GiB code sections. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 00:02, 22 June 2024 (UTC)
- I suspect I have been surprised by many features added to z/Architecture by now. But some relative branches can be resolved at link time, and so branch more than a single code section. If the linkage editor still has the ability to relink previously linked code, it will need to be able to unresolve such branches. As far as I know, Java still has a 64K byte limit for a single method. Gah4 (talk) 01:10, 22 June 2024 (UTC)
- I doubt that the linkage editor can handle branch relative between CSECTs, but possibly the Binder can. I'll have to check the Program Management manual. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 04:10, 23 June 2024 (UTC)
Actually, HLASM runs on zLinux, although probably not Solaris
...because OpenSolaris for System z was discontinued; otherwise, IBM might have ported it.
- Was HLASM ported to Linux to support moving existing assembler-language code to Linux? Guy Harris (talk) 06:22, 22 June 2024 (UTC)
- That would be my guess. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 04:02, 23 June 2024 (UTC)
- Could the huge range of the relative long instructions be for compiler support of monolithic C, COBOL, FORTRAN and PL/I applications? -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 04:02, 23 June 2024 (UTC)
- On Linux (and, were the project to have finished, on Solaris), it would more likely have been for compiler support of large shared libraries in C/C++/etc.; for executable images, PIC is only necessary if you're building position-independent executables whose starting address can be randomized. I can't speak for z/OS (or VSE or z/TPF). Guy Harris (talk) 10:47, 23 June 2024 (UTC)
- Modern processors discourage mixing of code and data, especially those with separate code and data caches, where it causes many problems. (That is, slow execution.) I am not, then, surprised that they don't supply PC relative data references. A common use of USING, along with DSECT, is named references to structure members. I suspect some assemblers have a way to do that. (That is, what would be part of a C struct.) USING used to be used for data references in the code, but, as above, that is now discouraged. Gah4 (talk) 21:46, 21 June 2024 (UTC)
- Note also that not all PC-relative addressing is general; some architectures support only PC-relative branching. SPARC, for example, has PC-relative branching with 22-bit displacements, plus a procedure-call instruction with a 30-bit displacement (if the upper 2 bits of the instruction are 01, it's a CALL instruction). Load and store instructions have either 2 register operands, or a register operand and a 13-bit offset, that are added together to generate the memory address. There's no equivalent of USING in SPARC assembler languages, but relatively little assembler-language programming is done for SPARC, unlike System/3x0. (And, for an OS that runs on S/390 and z/Architecture, related to the OSes that run on SPARC, the assembler doesn't, as far as I know, offer any equivalent to USING, as there's not much assembler-language programming done there, either - about 28 files with 4035 lines in the kernel and 91 files with 11160 lines in GNU libc.) — Preceding unsigned comment added by Guy Harris (talk • contribs) 20:13, 21 June 2024 (UTC)
- Describing PC-relative addressing and describing assembler support for it are two separate issues, and I believe that there is a case for both. Note that processors as far back as Atlas, GE 635 and DEC PDP-11 allowed indexing by the program counter, long before RISC architectures, e.g., IBM 801, Power ISA. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 10:59, 21 June 2024 (UTC)
Notes
- ^ Use the most recent HLASM reference.
References
- ^ IBM System/360 Principles of Operation (PDF). Systems Reference Library (Fourth ed.). IBM. September 1968. A22-6821-7. Retrieved June 20, 2024.
- ^ Reference Manual - UNIVAC III General - Data Processing System (PDF). Sperry Rand Corporation. 1962. UT-2488. Retrieved June 20, 2024.
- ^ UNIVAC III General Reference Manual - S A L T (PDF). Sperry Rand Corporation. 1962. UP·2558. Retrieved June 20, 2024.
- ^ OS Assembler Language - OS Release 21 (PDF). Systems Reference Library (Twelfth ed.). IBM. April 1976. GC28-6514-11. Retrieved June 20, 2024.
- ^ High Level Assembler for z/OS & z/VM & z/VSE - 1.6 - Language Reference (PDF). Systems Reference Library. IBM. 2021. SC26-4940-09. Retrieved June 20, 2024.
DSECT
IBM assemblers use DSECT in about the way that C programmers use struct. That is, for computing offsets into data structures. (In the case of struct pointers, that is all the compiler does. For an actual struct, it also allocates memory.) What I am wondering now, is which other assemblers have a similar feature? This would be the case where the assembler computes the offsets, and not a series of (the equivalent of) #define. Gah4 (talk) 23:31, 23 June 2024 (UTC)