How I learned x86 Assembly (Part 1)

Tram Ho

At first, I also thought Assembly was very difficult: “How on earth am I supposed to understand that pile of EAX and EBX?”
But believe me, Assembly is not that difficult. At least so far (I have only learned a little), I find working with the “EAX EBX pile” much less of a headache than working on algorithms :v

Let’s dig into x86 Assembly together :v (x86 ~ 32-bit and x64 ~ 64-bit)

1. Set up a “learning” environment

1.1 Windows

On Windows, you can install masm and write code in Visual Studio.

Apart from Visual Studio, I find the SASM IDE quite good too. Its advantages are that it is lightweight and can build both x86 and x64 code, which helps if you move on to x64 later.

For convenience, when learning Assembly you should install an IDE so that you can write and debug code in one place. In my experience, learning something new without a debugger to step through it throws away half of the benefit.

1.2 Ubuntu

On Ubuntu it is simpler: installing nasm takes a single command (masm is for Windows, nasm is for Linux).
Just copy this shell script into a text file and change the file extension to .sh.

Then run: bash <file name>.sh

For Ubuntu versions other than 18.04, replace 18.04 in the link in the wget command with your own version. Note that, at the time of writing, there is no nasm for Ubuntu versions higher than 18.04.

2. Background knowledge

After setting up the environment, leave it alone for a moment and learn the background knowledge first.
Some of it we have met before without studying it carefully, and some of it is completely new. We need to master this knowledge before starting to code, or we will run into a pile of errors without understanding why or knowing what to fix.
Since some of this knowledge is very common, I will only go deeper on the parts we rarely think about.

2.1 Data representation

In Assembly we can represent data in the form of:

  • Binary (Bin)
  • Hexadecimal (Hex)
  • Decimal (Dec)
  • Octal (Oct)

These four representations exist only for people to read. Whichever one is shown, the computer stores and works with data in binary. Data is usually displayed in Hex, because converting from Bin to Hex is more convenient than converting from Bin to Dec.
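To make this concrete, here is a minimal sketch in Python (my choice for illustration, the article itself does not use it). The function name `notations` is hypothetical. It shows one stored value written in all four notations:

```python
def notations(value):
    """Return the same 8-bit value written in the four notations."""
    return {
        "bin": format(value, "08b"),  # binary, padded to 8 digits
        "hex": format(value, "02X"),  # hexadecimal, padded to 2 digits
        "dec": format(value, "d"),    # decimal
        "oct": format(value, "o"),    # octal
    }

value = 0b10101100  # one byte, written here as a binary literal
for name, text in notations(value).items():
    print(name, text)
```

The value in memory never changes; only the way it is printed does.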

Do you know why our predecessors designed computers to convert Bin -> Hex for us to read,
and not Bin -> Dec, even though humans are most used to working in decimal?
The answer is at the end of the lesson.

x86 Assembly works with 32-bit data. A signed 32-bit value is represented as follows:

Representation   Min                                   Max
Bin              10000000 00000000 00000000 00000000   01111111 11111111 11111111 11111111
Hex              80 00 00 00                           7F FF FF FF

Min = 80 00 00 00 and max = 7F FF FF FF looks a bit strange, right? After all, 7F FF FF FF + 1 = 80 00 00 00.
Rest assured, this comes from the way the computer represents signed numbers.
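A small sketch of that wrap-around, assuming the signed interpretation explained in section 2.2.3. The helper `to_signed32` is a name I made up for this illustration, and Python stands in for the 32-bit register:

```python
def to_signed32(bits):
    """Interpret a 32-bit pattern as a signed integer (sign bit = bit 31)."""
    bits &= 0xFFFFFFFF  # keep only the low 32 bits
    return bits - (1 << 32) if bits & 0x80000000 else bits

max_pattern = 0x7FFFFFFF
next_pattern = (max_pattern + 1) & 0xFFFFFFFF  # a 32-bit register wraps around

print(to_signed32(max_pattern))   # 2147483647
print(hex(next_pattern))          # 0x80000000
print(to_signed32(next_pattern))  # -2147483648
```

So adding 1 to the largest positive pattern lands on the pattern that is read as the smallest negative number.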

2.2 Representing signed integers in a computer

2.2.1 Sign-magnitude method

In the sign-magnitude method, the leftmost bit (MSB, Most Significant Bit) is used as the sign bit. If the number is positive the sign bit = 0; if it is negative the sign bit = 1.

The remaining bits represent the magnitude. With n bits, the range of values that can be expressed is therefore only -(2 ^ (n - 1) - 1) -> 2 ^ (n - 1) - 1.

For example:

  • Number 5 in binary: 00000101
  • Number -5 in binary: 10000101

This representation has one inconsistency: 0 has two representations, 00000000 (+0) and 10000000 (-0).
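The encoding can be sketched in a few lines of Python. The function names here are hypothetical and the 8-bit width is an assumption that matches the examples above:

```python
def sign_magnitude(n, bits=8):
    """Return the sign-magnitude bit string of n using `bits` bits."""
    limit = (1 << (bits - 1)) - 1  # largest magnitude that fits
    if abs(n) > limit:
        raise ValueError("value does not fit")
    sign = "1" if n < 0 else "0"
    return sign + format(abs(n), "0{}b".format(bits - 1))

def from_sign_magnitude(pattern):
    """Decode a sign-magnitude bit string back to an integer."""
    magnitude = int(pattern[1:], 2)
    return -magnitude if pattern[0] == "1" else magnitude

print(sign_magnitude(5))    # 00000101
print(sign_magnitude(-5))   # 10000101
# Both zero patterns decode to the same value.
print(from_sign_magnitude("00000000"), from_sign_magnitude("10000000"))
```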

This method was used by first-generation computers, such as early IBM machines.

2.2.2 One's complement method

One's complement is similar to sign-magnitude, except for how it expresses the magnitude of a negative number:

  • If the number is positive, the sign bit = 0
  • If the number is negative, the sign bit = 1 and the value bits are inverted

For example:

  • Number 5 in binary: 00000101
  • Number -5 in binary: 11111010

One's complement still has two representations of 0: 00000000 (+0) and 11111111 (-0).

When adding one's complement binary numbers, if the addition produces a carry out of the leftmost bit (carry = 1), that carry must be added back into the result. This is inconvenient when the computer performs calculations.
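Here is a rough Python sketch of that extra step, often called an end-around carry. The 8-bit width and the function names are my own assumptions for the illustration:

```python
MASK = 0xFF  # work with 8-bit values

def ones_complement(n):
    """Encode a small integer as an 8-bit one's complement pattern."""
    return n & MASK if n >= 0 else ~(-n) & MASK

def add_ones_complement(a, b):
    """Add two 8-bit one's complement patterns with end-around carry."""
    total = a + b
    if total > MASK:                # a carry came out of the top bit
        total = (total & MASK) + 1  # add the carry back into the result
    return total & MASK

seven = ones_complement(7)
minus_five = ones_complement(-5)
print(format(minus_five, "08b"))                              # 11111010
print(format(add_ones_complement(seven, minus_five), "08b"))  # 00000010
print(format(add_ones_complement(5, minus_five), "08b"))      # 11111111, which is -0
```

Without the add-back step, 7 + (-5) would come out as 1 instead of 2.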

One's complement was used by older generations of computers such as the PDP and UNIVAC.

2.2.3 Two's complement method

Two's complement improves on this to make calculating with binary numbers easier. Specifically:

  • If the number is positive, the sign bit = 0
  • If the number is negative, the sign bit = 1: invert the value bits and add 1 to the result

Two's complement was created to overcome the two problems of sign-magnitude and one's complement:

  • The number 0 has two representations
  • A carry generated by the calculation must be added back into the result

Converting a number from positive to negative, or back, is done in one and the same way: invert all the bits and add 1.
When adding two's complement numbers, if a carry is generated out of the sign bit, we can simply discard it.
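Both rules fit in a short Python sketch. It assumes 8-bit values, and `negate` and `add` are hypothetical helper names:

```python
MASK = 0xFF  # work with 8-bit values

def negate(pattern):
    """Two's complement negation: invert all bits, then add 1."""
    return ((pattern ^ MASK) + 1) & MASK

def add(a, b):
    """Add two 8-bit patterns; the carry out of the sign bit is dropped."""
    return (a + b) & MASK

five = 0b00000101
minus_five = negate(five)
print(format(minus_five, "08b"))             # 11111011
print(format(negate(minus_five), "08b"))     # 00000101
print(format(add(five, minus_five), "08b"))  # 00000000
print(format(negate(0), "08b"))              # 00000000, only one zero
```

The same `negate` works in both directions, and masking with `MASK` is what "discarding the carry" looks like here.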

Modern computers use two's complement.

2.2.4 Two's complement in base 16

Two's complement in base 16 works the same way: invert all the bits and add 1. Inverting the bits in Hex is very simple: just subtract each Hex digit from 15 (F):

6A3D -> 95C2 + 1 -> 95C3
95C3 -> 6A3C + 1 -> 6A3D
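The digit-by-digit trick can be checked with a small Python sketch. The function name is hypothetical, and the result is kept to the same number of Hex digits as the input:

```python
def twos_complement_hex(hex_str):
    """Negate a Hex string: subtract each digit from 15, then add 1."""
    width = len(hex_str)
    inverted = "".join(format(15 - int(d, 16), "X") for d in hex_str)
    result = (int(inverted, 16) + 1) % (16 ** width)  # drop any carry out
    return format(result, "0{}X".format(width))

print(twos_complement_hex("6A3D"))  # 95C3
print(twos_complement_hex("95C3"))  # 6A3D
```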

2.3 Storage size

The basic unit of data storage on 32-bit computers is the 8-bit byte. Above the byte there are also the word (2 bytes), the doubleword (4 bytes) and the quadword (8 bytes).

Their value ranges are as follows:
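A minimal sketch, assuming Python, that computes both the unsigned range and the two's complement signed range for each size. The names `SIZES` and `ranges` are mine:

```python
SIZES = [("byte", 1), ("word", 2), ("doubleword", 4), ("quadword", 8)]

def ranges(num_bytes):
    """Return ((unsigned_min, unsigned_max), (signed_min, signed_max))."""
    bits = 8 * num_bytes
    unsigned = (0, (1 << bits) - 1)
    signed = (-(1 << (bits - 1)), (1 << (bits - 1)) - 1)
    return unsigned, signed

for name, num_bytes in SIZES:
    unsigned, signed = ranges(num_bytes)
    print(name, "unsigned:", unsigned, "signed:", signed)
```

For a byte this gives 0 to 255 unsigned and -128 to 127 signed; each larger size follows the same formula with more bits.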

2.4 String representation

A string is stored as an array of values, each one the code of the corresponding character in the ASCII table. These values are usually written in Hex; few people write them in decimal.
For example, the string “ABC123” can be written as 41h, 42h, 43h, 31h, 32h, 33h.
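You can reproduce that list with a short Python sketch. The function name `to_hex_bytes` is hypothetical, and the `h` suffix simply mimics the notation used above:

```python
def to_hex_bytes(text):
    """Return the ASCII codes of `text` as Hex values with an 'h' suffix."""
    return ", ".join(format(b, "02X") + "h" for b in text.encode("ascii"))

print(to_hex_bytes("ABC123"))  # 41h, 42h, 43h, 31h, 32h, 33h
```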

2.5 Boolean algebraic expressions

There are 4 basic boolean operations:

  • AND
  • OR
  • NOT
  • XOR

These operations are basic, so I will not go into detail. They are used a lot in Assembly, for example AND to clear (mask) bits, OR to set bits, XOR to toggle bits or zero out a register, and many other neat uses that we would not know about without learning Assembly :3
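As a quick illustration of the usual bit tricks, here is a sketch in Python on an 8-bit value. The variable names are made up, and the same operators exist as the AND, OR, XOR and NOT instructions in x86 Assembly:

```python
value = 0b10100101

set_bit3 = value | 0b00001000    # OR sets bit 3
clear_bit0 = value & 0b11111110  # AND with a mask clears bit 0
toggle_hi = value ^ 0b11110000   # XOR toggles the high four bits
zeroed = value ^ value           # XOR of a value with itself is 0
inverted = ~value & 0xFF         # NOT, kept to 8 bits

for name, v in [("set_bit3", set_bit3), ("clear_bit0", clear_bit0),
                ("toggle_hi", toggle_hi), ("zeroed", zeroed),
                ("inverted", inverted)]:
    print(name, format(v, "08b"))
```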



Source : Viblo