Assembly language, also called symbolic language, is a low-level language for electronic computers, microprocessors, microcontrollers, and other programmable devices. It uses mnemonics in place of the opcodes of machine instructions, and symbols or labels in place of the addresses of instructions and operands. Each device family has its own assembly language, matching its own machine instruction set, and an assembly process converts the source text into machine instructions. In general, the statements of a particular assembly language correspond one-to-one with the instructions of a particular machine instruction set, so assembly programs are not directly portable between platforms.
Many assemblers provide additional mechanisms that support program development, control of the assembly process, and debugging. Assemblers that provide macros are also called macro assemblers.
Assembly language is used far less widely than most other programming languages. In practice today it is used mainly for low-level hardware operations and for demanding program optimization. Device drivers, embedded operating systems, and real-time programs typically need at least some assembly language.
Basic introduction
Chinese name: Assembly Language
Foreign name: Assembly Language
Subject: Software Engineering
Generation age: 1950s
Compilation mode: Assembly

Development history

Any account of how assembly language arose has to start with machine language. Machine language is a collection of machine instructions, and a machine instruction is a command that a machine can execute directly. The machine instructions of a computer are sequences of binary digits, which the computer turns into sequences of high and low voltage levels that drive its electronics to perform operations.

The term computer here means a machine that can execute machine instructions and perform arithmetic, which was the concept of the early computers. In the PCs in common use, one chip performs this function: the CPU (Central Processing Unit). Because microprocessors differ in hardware design and internal structure, each needs different level pulses to make it work. Each microprocessor therefore has its own set of machine instructions, that is, its own machine language. Early programming was done in machine language.
Programmers punched program code made up of 0s and 1s onto paper tape or cards, with a hole for each 1 and no hole for each 0, and then fed the program into the computer through a tape reader or card reader to run it. Machine language of this kind, consisting purely of 0s and 1s, was complicated, hard to read or modify, and error-prone. Programmers soon found that it was difficult to recognize and memorize, and that it held back the development of the whole industry. Assembly language was created in response.

The main body of assembly language is the assembly instructions. Assembly instructions differ from machine instructions only in how they are written: an assembly instruction is a machine instruction in a form that is easy to remember. For example:

Operation: send the contents of register BX to AX
Machine instruction: 1000100111011000
Assembly instruction: mov ax,bx

From then on, programmers wrote source programs in assembly instructions. A computer, however, can only read and execute machine instructions, so a translation program is needed to convert assembly instructions into machine instructions. Such a program is called an assembler. The programmer writes the source program in assembly language, the assembler translates it into machine code, and the computer executes the result.

Language features

Assembly language is a programming language aimed directly at the processor. A processor works under the control of instructions, and each instruction that a processor can recognize is called a machine instruction. The set of instructions that a given processor can recognize is called its instruction set.
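The mov ax,bx example from the development history can be reproduced in a few lines. The sketch below is a toy encoder, not a real assembler: it handles only the register-to-register form of the 16-bit MOV, and it assumes the standard 8086 encoding in which opcode 89h means "MOV r/m16, r16" followed by a ModR/M byte. The function names encode_mov_r16_r16 and as_bits are hypothetical, chosen for illustration.

```python
# Toy encoder for the register-to-register form of the 8086 MOV instruction.
# Only "mov r16, r16" is handled; every other instruction is out of scope.

# 3-bit register numbers used in the ModR/M byte for 16-bit registers.
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}

MOV_RM16_R16 = 0x89  # opcode: MOV r/m16, r16 (destination in r/m, source in reg)


def encode_mov_r16_r16(line: str) -> bytes:
    """Translate e.g. 'mov ax,bx' into its two machine-code bytes."""
    mnemonic, operands = line.strip().lower().split(None, 1)
    if mnemonic != "mov":
        raise ValueError(f"unsupported mnemonic: {mnemonic}")
    dst, src = (part.strip() for part in operands.split(","))
    # mod = 0b11 selects register-direct addressing for the r/m field.
    modrm = (0b11 << 6) | (REG16[src] << 3) | REG16[dst]
    return bytes([MOV_RM16_R16, modrm])


def as_bits(code: bytes) -> str:
    """Render machine code as a string of 0s and 1s."""
    return "".join(f"{byte:08b}" for byte in code)


if __name__ == "__main__":
    print(as_bits(encode_mov_r16_r16("mov ax,bx")))  # 1000100111011000
```

The output matches the 16-bit machine instruction quoted earlier: the first byte is the opcode and the second names the two registers. A real assembler does the same kind of table lookup for every instruction form the processor supports.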
When a processor executes an instruction, it acts according to that instruction, either changing its own internal state or controlling the state of peripheral circuits.

Another feature of assembly language is that it operates not on abstract data but on registers and memory; that is, it deals with registers and memory directly. This is why assembly language can be faster than other languages, but it also makes programming more complex: because data lives in a register or in memory, it has an address, and the program must say how to find it. In the example above, the data cannot be named directly as in a high-level language; it must first be fetched from the registers AX and BX. In a high-level language the compiler handles addressing, whereas in assembly language the programmer does, which makes programs harder to write and harder to read.

Moreover, an assembly instruction is a symbolic representation of a machine instruction, and different types of CPU have different machine instruction sets and therefore different assembly languages, so an assembly language program is closely tied to the machine. Assembly programs are portable to some degree between models of the same CPU series, but not between different types of CPU (for example, between minicomputers and microcomputers). Assembly language programs are therefore less general and less portable than high-level language programs.
Because assembly language is machine-dependent, a programmer writing in it can allocate the machine's internal resources deliberately and keep them well used. Programs written this way have short code and run fast. Of all programming languages, assembly language has the closest and most direct relationship with the hardware and the highest efficiency in time and space. It is a required course in computer science and technology programs at colleges and universities, and it plays an important part in training students in program design, machine operation, and debugging.

General features

1. Machine dependence. Assembly language is a machine-oriented low-level language, usually designed for a specific computer or series of computers. Because it is a symbolic representation of machine instructions, different machines have different assembly languages. Being machine-oriented, it can exploit the characteristics of the machine to produce higher-quality programs.

2. High speed and efficiency. Assembly language keeps the advantages of machine language: it is direct and simple, it can access and control the computer's hardware (disk, memory, CPU, I/O ports, and so on), and its programs occupy little memory and run fast.

3. Complexity of writing and debugging. Because assembly language controls the hardware directly, even simple tasks take many statements. The programmer must consider every possible problem and allocate software and hardware resources sensibly, which inevitably increases the programmer's burden.
Likewise, when a program misbehaves at run time, the problem is hard to locate.

Advantages

1. A program written in assembly language is ultimately converted into machine instructions, so it keeps the qualities of machine language: it is consistent, direct, and simple, and it can access and control computer hardware such as disk, memory, CPU, and I/O ports just as machine instructions can. Assembly language can reach every software and hardware resource that is accessible at all.

2. The object code is short, occupies little memory, and runs fast. Assembly language is often used together with a high-level language to improve a program's speed and efficiency and to make up for the high-level language's weakness in hardware control, and it is very widely used in this role.

Disadvantages

1. Assembly language is machine-oriented and sits at the bottom of the hierarchy of computer languages, which is why it is regarded as a low-level language, usually designed for a specific computer or series of computers. Different processors have different assembly language syntax and different assemblers, and an assembled program cannot run on a different processor, so portability is lacking.

2. The design intent of a program is hard to read from assembly code, so maintainability is poor. Even a simple job takes a large amount of code, which breeds bugs and is hard to debug.

3. Using assembly language requires a very good understanding of a particular processor, and code can be optimized only for a particular architecture and processor. Development is inefficient, slow, and monotonous.
Language composition

Data transfer instructions

This group includes the general data transfer instruction MOV, the conditional transfer instructions CMOVcc, the stack instructions PUSH/PUSHA/PUSHAD/POP/POPA/POPAD, the exchange instructions XCHG/XLAT/BSWAP, and the address and segment selector load instructions LEA/LDS/LES/LFS/LGS/LSS. Note that CMOVcc is not a single instruction but a family of instructions, each of which decides whether to perform the transfer according to the state of certain bits in the EFLAGS register.

Integer and logic instructions

These instructions perform arithmetic and logic operations. They include the addition instructions ADD/ADC, the subtraction instructions SUB/SBB, the increment instruction INC, the decrement instruction DEC, the comparison instruction CMP, the multiplication instructions MUL/IMUL, the division instructions DIV/IDIV, the sign extension instructions CBW/CWDE/CDQE, the decimal adjustment instructions DAA/DAS/AAA/AAS, and the logical instructions NOT/AND/OR/XOR/TEST.

Shift instructions

These instructions shift a register or memory operand by a specified number of bit positions. They include the logical left shift SHL, the logical right shift SHR, the arithmetic left shift SAL, the arithmetic right shift SAR, the rotate left ROL, and the rotate right ROR.

Bit manipulation instructions

This group includes the bit test instruction BT, bit test and set BTS, bit test and reset BTR, bit test and complement BTC, bit scan forward BSF, and bit scan reverse BSR.
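The difference between the logical, arithmetic, and rotate forms of the shift instructions is easiest to see on a concrete value. The following sketch models them on 16-bit operands in Python. It is a simplification under stated assumptions: the processor flags are ignored, shift counts are assumed to lie between 0 and 15, and the function names simply mirror the mnemonics.

```python
MASK16 = 0xFFFF  # work on 16-bit values, as with AX, BX, and so on


def shl(value: int, count: int) -> int:
    """Logical left shift (SHL; SAL gives the same result): zeros enter on the right."""
    return (value << count) & MASK16


def shr(value: int, count: int) -> int:
    """Logical right shift (SHR): zeros enter on the left."""
    return (value & MASK16) >> count


def sar(value: int, count: int) -> int:
    """Arithmetic right shift (SAR): copies of the sign bit enter on the left."""
    value &= MASK16
    if value & 0x8000:        # negative in two's complement
        value -= 0x10000      # reinterpret as a signed Python int
    return (value >> count) & MASK16


def rol(value: int, count: int) -> int:
    """Rotate left (ROL): bits leaving on the left re-enter on the right."""
    count %= 16
    value &= MASK16
    return ((value << count) | (value >> (16 - count))) & MASK16


def ror(value: int, count: int) -> int:
    """Rotate right (ROR): bits leaving on the right re-enter on the left."""
    count %= 16
    value &= MASK16
    return ((value >> count) | (value << (16 - count))) & MASK16


if __name__ == "__main__":
    x = 0x8001
    for op in (shl, shr, sar, rol, ror):
        print(f"{op.__name__.upper()} {x:#06x}, 1 -> {op(x, 1):#06x}")
```

For the operand 8001h and a count of 1, SHR yields 4000h while SAR yields C000h, because SAR preserves the sign bit; ROL yields 0003h, because the bit shifted out on the left comes back in on the right.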
Conditional set instructions

This is not a single instruction but a family of about 30 instructions that set an 8-bit register or memory operand according to the state of certain bits in the EFLAGS register, for example SETE, SETNE, and SETGE.

Control transfer instructions

This group includes the unconditional jump JMP, the conditional jumps Jcc/JCXZ, the loop instructions LOOP/LOOPE/LOOPNE, the procedure call CALL, the procedure return RET, and the interrupt instructions INT n, INT3, INTO, and IRET. Note that Jcc is a family of instructions, each of which decides whether to jump according to the state of certain bits in the EFLAGS register. INT n is the software interrupt instruction, where n is an interrupt vector number from 0 to 255.

String operation instructions

These instructions operate on strings of data. They include the string move MOVS, string compare CMPS, string scan SCAS, string load LODS, and string store STOS, which can optionally be prefixed with REP, REPE/REPZ, or REPNE/REPNZ to repeat the operation.

Input/output instructions

These instructions exchange data with peripheral devices. They include the port input instructions IN/INS and the port output instructions OUT/OUTS.

High-level language support instructions

These instructions are a convenience for high-level language compilers. They include ENTER, which creates a stack frame, and LEAVE, which releases it.
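The conditional set and conditional jump families described above both work by reading flag bits that an earlier instruction, typically CMP, left in EFLAGS. The sketch below models that two-step pattern for 16-bit operands. It is an illustration only: just four flags and six condition codes are modelled, and the helper names cmp16, setcc, and jcc are hypothetical.

```python
MASK16 = 0xFFFF


def cmp16(a: int, b: int) -> dict:
    """Model CMP a, b on 16-bit operands: compute a - b and keep only the flags."""
    a &= MASK16
    b &= MASK16
    result = (a - b) & MASK16
    sign_a, sign_b, sign_r = a >> 15, b >> 15, result >> 15
    return {
        "ZF": int(result == 0),   # zero flag: operands were equal
        "SF": sign_r,             # sign flag: top bit of the result
        "CF": int(a < b),         # carry flag: unsigned borrow occurred
        "OF": int(sign_a != sign_b and sign_r != sign_a),  # signed overflow
    }


# Each condition code is a predicate over the flags.
CONDITIONS = {
    "e":  lambda f: f["ZF"] == 1,        # equal
    "ne": lambda f: f["ZF"] == 0,        # not equal
    "ge": lambda f: f["SF"] == f["OF"],  # signed greater or equal
    "l":  lambda f: f["SF"] != f["OF"],  # signed less
    "b":  lambda f: f["CF"] == 1,        # unsigned below
    "ae": lambda f: f["CF"] == 0,        # unsigned above or equal
}


def setcc(cond: str, flags: dict) -> int:
    """Model SETcc: produce 1 if the condition holds, else 0."""
    return int(CONDITIONS[cond](flags))


def jcc(cond: str, flags: dict, target: int, fallthrough: int) -> int:
    """Model Jcc: return the address of the next instruction to execute."""
    return target if CONDITIONS[cond](flags) else fallthrough


if __name__ == "__main__":
    flags = cmp16(0xFFFF, 1)   # -1 versus 1 when read as signed values
    print(flags, setcc("l", flags), setcc("ae", flags))
```

The same bit pattern FFFFh compares as less than 1 under the signed condition L and as above or equal under the unsigned condition AE, which is why the instruction set provides separate signed and unsigned condition codes.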
Control and privileged instructions

This group includes the no-operation instruction NOP, the halt instruction HLT, the wait instructions WAIT/MWAIT, the coprocessor escape instruction ESC, the bus lock prefix LOCK, the array bounds check instruction BOUND, the global descriptor table instructions LGDT/SGDT, the interrupt descriptor table instructions LIDT/SIDT, the local descriptor table instructions LLDT/SLDT, the segment limit load instruction LSL, the access rights load instruction LAR, the task register instructions LTR/STR, the requested privilege level adjustment instruction ARPL, the clear task-switched flag instruction CLTS, the MOV forms that transfer data to and from control registers and debug registers, the cache and TLB control instructions INVD/WBINVD/INVLPG, the model-specific register read and write instructions RDMSR/WRMSR, the processor identification instruction CPUID, the timestamp counter read instruction RDTSC, and others.

Floating-point and multimedia instructions

These instructions accelerate floating-point arithmetic, and the single-instruction, multiple-data (SIMD) instructions and their SSEx extensions accelerate multimedia data processing. There are too many to list here; see the Intel manuals for details.

Virtual machine extension instructions

This group includes INVEPT/INVVPID/VMCALL/VMCLEAR/VMLAUNCH/VMRESUME/VMPTRLD/VMPTRST/VMREAD/VMWRITE/VMXOFF/VMXON.

Related technologies

Assembler

A typical modern assembler builds object code by translating instruction mnemonics into opcodes and by resolving symbolic names into memory addresses and other entities.
The use of symbolic references is an important feature of an assembler, because it spares the programmer the tedious and time-consuming job of recalculating addresses by hand after every change to the program. In essence, assembly language spells machine code in letters, and the assembler replaces those letters with the hard-to-read machine code.

Assembly environment

A symbolic program written in a language other than machine language, such as assembly language, is called a source program, and the job of the assembler is to translate the source program into an object program. The object program is a machine language program that the CPU can execute once it is placed at a predetermined location in memory.

Debugging environments for assembly language are generally scarce, and very good assemblers are few. The choice of assembler depends on the target processor and the system platform. A good tool should be easy to use: for example, it should format code automatically, highlight syntax, and combine assembling, linking, and debugging in one convenient package.

For the widely used personal computer, the freely available choices include MASM, NASM, TASM, GAS, FASM, and RADASM, but most of them have no debugging functions. For learning assembly language, Easy Assembly suits beginners well because it has a well-integrated environment.
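The symbol resolution described under Assembler is commonly done in two passes: the first pass records the address of every label, and the second replaces each label operand with that address. The sketch below shows the idea on a toy source format. It makes simplifying assumptions that do not hold for real x86 code: every instruction is taken to occupy a fixed 2 bytes, labels are lines ending in a colon, and comments start with a semicolon. The names resolve_labels and PROGRAM are hypothetical.

```python
PROGRAM = """
start:
    mov ax,bx
    jmp done        ; forward reference, unknown until pass 1 finishes
    inc ax
done:
    jmp start       ; backward reference
"""


def resolve_labels(source: str, origin: int = 0, size: int = 2):
    """Two-pass sketch: collect label addresses, then substitute them."""
    symbols = {}
    instructions = []
    address = origin

    # Pass 1: walk the source, assigning an address to each label and instruction.
    for raw in source.splitlines():
        line = raw.split(";", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        if line.endswith(":"):
            symbols[line[:-1]] = address      # a label names the next address
            continue
        instructions.append((address, line))
        address += size                       # assumed fixed instruction size

    # Pass 2: replace operands that are labels with their numeric addresses.
    resolved = []
    for address, line in instructions:
        mnemonic, _, operand = line.partition(" ")
        operand = operand.strip()
        if operand in symbols:
            operand = f"{symbols[operand]:#06x}"
        resolved.append((address, f"{mnemonic} {operand}".strip()))
    return symbols, resolved


if __name__ == "__main__":
    table, listing = resolve_labels(PROGRAM, origin=0x100)
    for address, text in listing:
        print(f"{address:#06x}  {text}")
```

If an instruction is inserted before the label done, rerunning the function updates every reference automatically, which is exactly the manual recalculation that symbolic references save.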
Development prospects

Assembly language is a mnemonic form of machine language. Compared with raw machine code it is easier to read, write, debug, and modify. A skilled assembly programmer can also, through careful design, produce code that runs faster and uses less memory than code compiled from a high-level language. That advantage depends on skilful design, however, and some high-level languages now compile to very efficient code, so it is slowly shrinking. Assembly language also has obvious limitations for writing complex programs, and it depends on the specific machine model, so programs cannot be generalized or ported between models.

Calling assembly language a low-level language does not mean it should be abandoned. On the contrary, it remains a language that programmers who design the lowest levels of a computer (or microcomputer) must understand, and in some industries and fields it is indispensable. In IT, now the largest field of computer software and what is usually meant by computer software programming, a program written in assembly language by a skilled programmer runs more efficiently than one written in another language, but at the price of much longer optimization time. A programmer without a solid grounding in computer principles and programming only makes development harder, and the loss outweighs the gain.
By around 2010 software development had become a market-driven industry, and high-level languages had become excellent and cross-platform. A company cannot have a team write everything in assembly language at several times or even dozens of times the cost in time. It is better to use another language: as long as the result is not much worse than hand-written assembly, finishing a step ahead of competitors matters more. This is an inevitable result of the market economy.

Even so, no programmer has yet dared to conclude that assembly language is not worth learning. Because assembly language is machine-oriented, some of the best assembly programmers have left software development and moved into industrial electronics programming. In industries where the functions are relatively small but the demands on hardware-level design are high, such as those using 4-bit microcontrollers with their limited capacity and computing power, electronics engineers generally handle everything from circuit design to control software. Their main development language is assembly, and C accounts for only a small share. Such engineers are hard to find. In some industrial companies a core electronics engineer is paid more than any other employee, and electronics engineers in general are said to be paid more than ten times what programmers are.
The reason is that, since the start of the 21st century, many people have studied assembly language but few have become proficient. Compared with a high-level language it is harder to learn, harder to use, and narrower in application; its instructions are simple, but it is extremely flexible. Someone who learned a high-level language first finds assembly much harder to learn than a complete beginner does, whereas someone who has learned assembly finds a high-level language very easy: going from the complex to the simple is easy, and going from the simple to the complex is hard. For a programmer who wants to fully understand the principles of microcomputers, assembly language is a must.

As modern software systems grew larger and more complex, a large number of well-encapsulated high-level languages emerged, such as C/C++ and Pascal/Object Pascal. These languages made development easier and more efficient and let developers keep up with the demand for rapid software development. The complexity of assembly language has gradually narrowed its range of application. This does not mean that assembly is no longer useful. Because it is closer to machine language, it can operate the hardware directly and produce programs that are faster and smaller than those of other languages. It is therefore widely used in time-critical programs, in the core modules of many large programs, and in industrial control.

In addition, despite the many programming languages available, assembly language is still a required course for computer science majors at universities, where it gives students an in-depth understanding of how computers work.

Historically, assembly language was one of the most popular programming languages. As software grew in size, and with it the demand for faster and more efficient development, high-level languages gradually replaced it.
Even so, high-level languages cannot completely take over the role of assembly language. Take the Linux kernel: the vast majority of its code is written in C, but assembly code is still unavoidable in some key places. That code sits so close to the hardware that even C falls short, while assembly language can play to its strengths and get the most out of the hardware.

First, most assembly language statements correspond directly to machine instructions, so they execute quickly and efficiently and the code is small. This matters most where memory is limited but fast, real-time response is required, as in instruments and industrial control equipment.

Second, assembly language is used in the core parts of system programs and in the parts that deal frequently with hardware: for example, the core routines of operating systems, the initialization code for I/O interface circuits, the low-level drivers of external devices, frequently called subroutines, dynamic link libraries, some advanced graphics programs, and video games.

Third, assembly language is used for software encryption and decryption, for analyzing and defending against computer viruses, and for debugging and error analysis.

Finally, studying assembly language deepens one's understanding of courses such as Computer Principles and Operating Systems. By learning and using assembly language, one can observe, experience, and understand the logical functions of the machine, which lays a theoretical foundation for understanding the principles of software systems and a practical foundation for mastering the principles of hardware systems.
Classic textbooks

There are many assembly language textbooks for the various processors, at a rough count no fewer than a hundred. The more widely used ones fall into the following groups.

x86 processor

1. x86 Assembly Language: From Real Mode to Protected Mode, by Li Zhong, Electronic Industry Press, 2013-1. Based on the Intel x86 processor, the NASM assembler, and the Bochs virtual machine. Assembly language is the language of the processor, so in this book's view, learning it means programming the hardware directly rather than relying on opaque DOS interrupts and API calls. The book does not spend its pages on dull arithmetic problems. Instead, it teaches how to control hardware directly: displaying text, reading data from the hard disk, and controlling other hardware without the help of the BIOS, DOS, Windows, Linux, or any other software. With 32-bit and 64-bit now mainstream, real mode and DOS are history, and Linux and Windows run in protected mode. The book moves from real mode to 32-bit protected mode, with the emphasis on the latter, and is very helpful for understanding how modern computers and operating systems work.

2. Assembly Language (2nd Edition), by Wang Shuang, Tsinghua University Press, 2013-4-1. Based on the Intel 8086 processor, the MASM assembler, and the DOS platform. It is confined to the real mode of the 8086 and does not cover the commonly used 32-bit and 64-bit modes, but it is easy to understand and well received by readers.

3. 80X86 Assembly Language Programming Tutorial, edited by Yang Jiwen and others, Tsinghua University Press, 1999-3-1. Based on the Intel x86 processor and the MASM and TASM assemblers. Covers 16-bit real mode and 32-bit protected mode, the latter in more detail.

4. 32-bit Assembly Language Programming, by Qian Xiaojie, Mechanical Industry Press, 2011-8-1. Based on the Intel x86 processor, the MASM assembler, and the Windows platform.

5. 16/32-bit Microcomputer Principles, Assembly Language and Interface Technology, by Qian Xiaojie and Chen Tao, Mechanical Industry Press, 2005-2-1. Based on the Intel x86 processor. Discusses the basic principles, assembly language, and interface technology of 16-bit microcomputers and introduces the related technology of 32-bit microcomputer systems.

6. Intel Assembly Language Programming (Fifth Edition), by Irvine (U.S.), Electronic Industry Press, 2012-7-1. Based on the Intel x86 processor, the MASM assembler, and the DOS/Windows platforms. Covers both 16-bit real mode and 32-bit protected mode.

7. The Art of Assembly Language Programming (2nd Edition), by Hyde (U.S.), Tsinghua University Press, 2011-12-1. Based on the Intel x86 processor. Uses the author's own High Level Assembly (HLA) as the teaching tool, which brings in some of the advantages and features of high-level languages.

8. x86 PC Assembly Language, Design and Interface (Fifth Edition), by Mazidi and Causey (U.S.), Electronic Industry Press, 2011-1-1. Based on the Intel x86 processor. Covers both 16-bit real mode and 32-bit protected mode, with an introduction to 64-bit as well.

ARM and microcontroller

1. Assembly Language Programming: Based on ARM Architecture (2nd Edition), edited by Wen Quanguang and others, Beijing University of Aeronautics and Astronautics Press, 2010-8-1. Based on ARM architecture processors. An introductory textbook for learning embedded technology.

2. Zero Basic Learning AVR Microcontroller, edited by Xu Yimin and others, Mechanical Industry Press, 2011-1-1. Covers an overview of microcontrollers, AVR development tools, C for AVR microcontrollers, the basic structure of the ATmega16, and the AVR instruction set and assembly system.

3. Multisim 10-Based 51 Microcontroller Simulation Tutorial, edited by Nie Dian and Ding Wei, Electronic Industry Press, 2010-2-1. Explains the main functions of NI Multisim 10 for microcontroller simulation.

4. PIC18 Microcontroller: Architecture, Programming and Interface Design, by Berry (U.S.), Tsinghua University Press, 2009-4-1. Microcontrollers are widely used in automobiles, home appliances, industrial control, medical equipment, and many other fields. Taking Microchip's PIC18 series as its example, the book explains comprehensively how to program microcontrollers in C and assembly language.

5. CASL Assembly Language Programming, edited by Zhao Lihui, China Electric Power Press, 2002-10-1. CASL assembly language is a mandatory part of the Senior Programmer level of the Chinese Computer Software Professional and Technical Qualification and Level Examination. This book is a monograph on CASL assembly language programming.