Anyone who has studied Ethereum will have come across the Ethereum Virtual Machine (EVM). Really understanding how it works is not easy, though. I found it complicated and confusing, so I never dug into it deeply. In today's article, we will take a closer look at the structure and operation of the EVM.
1. What is EVM?
The EVM is the part of the Ethereum network that handles the deployment and execution of smart contracts. A plain ETH transfer from one externally owned account (EOA) to another EOA does not need to be processed by the EVM. The EVM runs on every client (node) of the Ethereum network, aiming to act as a global decentralized computer.
The EVM is a quasi-Turing-complete state machine: its instruction set is Turing complete, but every execution is limited to a finite number of computational steps by the amount of gas supplied. This differs from Bitcoin, whose stack-based Script language is deliberately not Turing complete.
The EVM has a stack-based architecture: all operands live on a stack, and the word size is 256 bits. Data is stored in three places:
- An immutable code ROM, loaded with the bytecode of the smart contract being executed.
- A volatile memory that lasts only for the duration of a call. In Solidity, use the keyword memory to place data here.
- A persistent storage that is part of the contract's state. In Solidity, use the keyword storage to place data here.
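The data areas described above, together with the stack, can be pictured with a minimal Python sketch. The class and field names (`Machine`, `code`, `memory`, `storage`) are hypothetical and chosen only for illustration; real clients organize this state differently.

```python
from dataclasses import dataclass, field

WORD_BITS = 256                      # EVM word size
WORD_MASK = (1 << WORD_BITS) - 1
STACK_LIMIT = 1024                   # maximum stack depth of the EVM

@dataclass
class Machine:
    """Illustrative model of the EVM data areas (all names are hypothetical)."""
    code: bytes                                            # immutable ROM with the contract bytecode
    stack: list = field(default_factory=list)              # 256-bit words, top of stack is the last item
    memory: bytearray = field(default_factory=bytearray)   # volatile, discarded when the call ends
    storage: dict = field(default_factory=dict)            # persistent mapping of word -> word

    def push(self, value: int) -> None:
        if len(self.stack) >= STACK_LIMIT:
            raise OverflowError("stack overflow")
        self.stack.append(value & WORD_MASK)               # truncate to 256 bits

    def pop(self) -> int:
        return self.stack.pop()

m = Machine(code=bytes.fromhex("6080604052"))
m.push(2**256 + 5)       # wraps around to 5
m.storage[0] = m.pop()   # storage is the only area that would outlive the call
```

Using `bytes` for the code and a `dict` for storage mirrors the roles of the areas: the code cannot be modified in place, while storage is a sparse key-value space.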
2. Instruction set
EVM instructions (opcodes) are divided into the following categories:
1. Arithmetic
- ADD: Addition
- MUL: Multiplication
- SUB: Subtraction
- DIV: Unsigned integer division
- SDIV: Signed integer division
- MOD: Unsigned modulo
- SMOD: Signed modulo
- ADDMOD: Addition modulo N
- MULMOD: Multiplication modulo N
- EXP: Exponentiation
- SIGNEXTEND: Extends the sign bit of a shorter signed integer to the full 256-bit word
- SHA3: Computes the Keccak-256 hash (shown as KECCAK256 by newer tooling)
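Because every operand is a 256-bit word, the arithmetic instructions wrap around modulo 2^256, and the signed variants reinterpret the same bits as two's-complement numbers. The following Python sketch illustrates a few of them; the function names are hypothetical helpers, not part of any Ethereum library.

```python
MOD = 1 << 256

def to_signed(x: int) -> int:
    """Interpret a 256-bit word as a two's-complement signed integer."""
    return x - MOD if x >= (1 << 255) else x

def to_word(x: int) -> int:
    """Wrap a Python integer back into an unsigned 256-bit word."""
    return x % MOD

def add(a: int, b: int) -> int:
    return (a + b) % MOD                      # overflow wraps around

def sub(a: int, b: int) -> int:
    return (a - b) % MOD                      # underflow wraps around

def div(a: int, b: int) -> int:
    return 0 if b == 0 else a // b            # division by zero yields 0, no exception

def sdiv(a: int, b: int) -> int:
    if b == 0:
        return 0
    sa, sb = to_signed(a), to_signed(b)
    q = abs(sa) // abs(sb)                    # truncate toward zero
    return to_word(-q if (sa < 0) != (sb < 0) else q)

def addmod(a: int, b: int, n: int) -> int:
    return 0 if n == 0 else (a + b) % n       # the intermediate sum is not truncated
```

For example, `add(MOD - 1, 1)` gives 0, while `sdiv(to_word(-7), 2)` gives the word that represents -3.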
2. Stack, memory, and storage operations
- POP: Removes the item on top of the stack
- MLOAD: Loads 1 word (32 bytes) from memory
- MSTORE: Saves 1 word to memory
- MSTORE8: Saves 1 byte to memory
- SLOAD: Loads 1 word from storage
- SSTORE: Saves 1 word to storage
- MSIZE: Gets the size of the memory currently in use, in bytes
- PUSHx: Pushes an x-byte value onto the stack (x from 1 to 32)
- DUPx: Duplicates the x-th stack item (x from 1 to 16)
- SWAPx: Swaps the top stack item with the (x + 1)-th item (x from 1 to 16)
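To make the memory and stack instructions more concrete, here is a small Python sketch. It assumes a byte-addressed, zero-initialized memory that grows in 32-byte words and a stack stored as a list whose last element is the top. The class and helper names are hypothetical.

```python
class Memory:
    """Byte-addressed memory that grows in 32-byte words (illustrative only)."""

    def __init__(self) -> None:
        self.data = bytearray()

    def _expand(self, offset: int, size: int) -> None:
        end = offset + size
        if end > len(self.data):
            new_len = (end + 31) // 32 * 32           # round up to a whole word
            self.data.extend(bytes(new_len - len(self.data)))

    def mstore(self, offset: int, value: int) -> None:
        self._expand(offset, 32)
        self.data[offset:offset + 32] = value.to_bytes(32, "big")

    def mstore8(self, offset: int, value: int) -> None:
        self._expand(offset, 1)
        self.data[offset] = value & 0xFF              # only the lowest byte is kept

    def mload(self, offset: int) -> int:
        self._expand(offset, 32)                      # reading also expands memory
        return int.from_bytes(self.data[offset:offset + 32], "big")

    def msize(self) -> int:
        return len(self.data)

def dup(stack: list, x: int) -> None:
    stack.append(stack[-x])                           # DUPx copies the x-th item from the top

def swap(stack: list, x: int) -> None:
    stack[-1], stack[-1 - x] = stack[-1 - x], stack[-1]   # SWAPx: top <-> (x + 1)-th item

mem = Memory()
mem.mstore(0x40, 0x80)    # the word now occupies bytes 0x40..0x5f
```

After the `mstore` call, `mem.msize()` is 0x60, because memory has been extended just far enough to hold the word written at 0x40.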
3. Control flow (program counter)
- STOP: Halts execution
- JUMP: Sets the program counter (PC) to a given value
- JUMPI: Sets the PC to a given value only if a condition is non-zero
- PC: Gets the current value of the PC
- JUMPDEST: Marks a valid jump destination
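JUMP and JUMPI may only land on a JUMPDEST instruction, and a 0x5b byte that is merely part of PUSH data does not count. A minimal sketch of how the valid destinations of a piece of bytecode could be collected is shown below; the function name is hypothetical, and the sample bytes are the first 17 bytes of the Example.bin output shown later in this article.

```python
def valid_jump_destinations(code: bytes) -> set:
    """Return the offsets of JUMPDEST (0x5b) bytes that are real instructions,
    skipping the immediate data that follows PUSH1..PUSH32 (0x60..0x7f)."""
    dests, pc = set(), 0
    while pc < len(code):
        op = code[pc]
        if op == 0x5B:
            dests.add(pc)
        if 0x60 <= op <= 0x7F:
            pc += op - 0x5F          # skip the 1..32 pushed bytes
        pc += 1
    return dests

prefix = bytes.fromhex("6080604052348015600f57600080fd5b50")
print(valid_jump_destinations(prefix))   # {15}
```

Offset 15 is 0xF, which is exactly the value pushed by `PUSH1 0xF` before the JUMPI in that bytecode.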
4. System operations
- LOGx: Appends a log record with x topics (x from 0 to 4)
- CREATE: Creates a new account with associated code (a new contract)
- CALL: Makes a message call to another account
- CALLCODE: Makes a message call into the current account using another account's code
- RETURN: Halts execution and returns output data
- DELEGATECALL: Runs another account's code in the context of the current contract, keeping its storage as well as the original sender and value
- STATICCALL: Makes a call that is not allowed to modify state
- REVERT: Halts execution and reverts the state changes made so far
- INVALID: Designated invalid instruction; execution aborts
- SELFDESTRUCT: Destroys the contract and transfers its entire balance to another account
5. Logic
- LT: Unsigned less-than comparison
- GT: Unsigned greater-than comparison
- SLT: Signed less-than comparison
- SGT: Signed greater-than comparison
- EQ: Equality comparison
- ISZERO: Returns 1 if the value is 0, otherwise 0 (logical NOT)
- AND: Bitwise AND
- OR: Bitwise OR
- XOR: Bitwise XOR
- NOT: Bitwise NOT
- BYTE: Gets 1 byte from a 32-byte word
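The difference between the unsigned and signed comparisons, and between ISZERO and NOT, is easy to miss. The Python sketch below shows the intended semantics on 256-bit words; the helper names are hypothetical.

```python
MOD = 1 << 256

def signed(x: int) -> int:
    """Reinterpret a 256-bit word as a two's-complement signed integer."""
    return x - MOD if x >> 255 else x

def lt(a: int, b: int) -> int:
    return int(a < b)                         # unsigned comparison

def slt(a: int, b: int) -> int:
    return int(signed(a) < signed(b))         # signed comparison

def iszero(a: int) -> int:
    return int(a == 0)                        # logical NOT: result is 0 or 1

def not_(a: int) -> int:
    return a ^ (MOD - 1)                      # bitwise NOT: flips all 256 bits

def byte(i: int, x: int) -> int:
    """Return byte i of the 32-byte word x, where byte 0 is the most significant."""
    return (x >> (8 * (31 - i))) & 0xFF if i < 32 else 0

minus_one = MOD - 1                           # the word that represents -1
```

With these definitions `lt(minus_one, 1)` is 0, because the word is a very large unsigned number, while `slt(minus_one, 1)` is 1.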
6. Environment
- GAS: Gets the amount of gas remaining during execution
- ADDRESS: Gets the address of the currently executing account
- BALANCE: Gets the balance of a given account
- ORIGIN: Returns the address of the user (EOA) that initiated the transaction
- CALLER: Returns the address of the immediate caller (which may be a contract address)
- CALLVALUE: Returns the amount of wei sent with the call
- CALLDATALOAD: Loads 1 word of the input data onto the stack
- CALLDATASIZE: Returns the size of the input data
- CALLDATACOPY: Copies the input data to memory
- CODESIZE: Returns the size of the code running in the current environment
- CODECOPY: Copies the running code to memory
- GASPRICE: Returns the gas price of the current transaction
- EXTCODESIZE: Returns the size of the code of any account (an EOA has a code size of 0)
- EXTCODECOPY: Copies the code of any account to memory
- RETURNDATASIZE: Returns the size of the output data of the previous call
- RETURNDATACOPY: Copies the output data of the previous call to memory
7. Block information
- BLOCKHASH: Gets the hash of one of the last 256 blocks
- COINBASE: Gets the block's beneficiary address (the recipient of the block reward)
- TIMESTAMP: Gets the block's timestamp
- NUMBER: Gets the block number
- DIFFICULTY: Gets the block's difficulty
- GASLIMIT: Gets the block's gas limit
3. Compiling Solidity to EVM bytecode
In this section we look in more detail at how Solidity code is compiled into bytecode. First, we need to install the Solidity compiler:
    npm install -g solc
Note that this npm package provides the command solcjs, whose options differ from those of the native solc binary. The commands below assume the native solc compiler is available.
We have a simple source file as follows:
    // Example.sol
    pragma solidity ^0.6.12;

    contract Example {
        address contractOwner;

        constructor() public {
            contractOwner = msg.sender;
        }
    }
Compile it with the following commands:
    solc -o bytescode --asm ./Example.sol
    solc -o bytescode --opcodes ./Example.sol
    solc -o bytescode --bin ./Example.sol
--asm gives the output as EVM assembly, --opcodes gives it in the form of opcodes, and --bin outputs the contract's binary in hex form. To see more options and an explanation of each parameter, run the command solc --help. The result is the following 3 files:
Example.opcode
    PUSH1 0x80 PUSH1 0x40 MSTORE CALLVALUE DUP1 ISZERO PUSH1 0xF JUMPI PUSH1 0x0 DUP1 REVERT JUMPDEST POP CALLER PUSH1 0x0 DUP1 PUSH2 0x100 EXP DUP2 SLOAD DUP2 PUSH20 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF MUL NOT AND SWAP1 DUP4 PUSH20 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF AND MUL OR SWAP1 SSTORE POP PUSH1 0x3F DUP1 PUSH1 0x5D PUSH1 0x0 CODECOPY PUSH1 0x0 RETURN INVALID PUSH1 0x80 PUSH1 0x40 MSTORE PUSH1 0x0 DUP1 REVERT INVALID LOG2 PUSH5 0x6970667358 0x22 SLT KECCAK256 STATICCALL 0xDE PUSH22 0xB41B5EE14B779448E07EE6894AC84FACD068A74CDD3A 0x2B 0xB1 0xF8 CALLER GASPRICE 0xAD 0x49 PUSH5 0x736F6C6343 STOP MOD 0xC STOP CALLER
Example.evm
        /* "./Example.sol":26:132  contract Example {... */
      mstore(0x40, 0x80)
        /* "./Example.sol":72:130  constructor() public {... */
      callvalue
      dup1
      iszero
      tag_1
      jumpi
      0x00
      dup1
      revert
    tag_1:
      pop
        /* "./Example.sol":116:126  msg.sender */
      caller
        /* "./Example.sol":100:113  contractOwner */
      0x00
      dup1
        /* "./Example.sol":100:126  contractOwner = msg.sender */
      0x0100
      exp
      dup2
      sload
      dup2
      0xffffffffffffffffffffffffffffffffffffffff
      mul
      not
      and
      swap1
      dup4
      0xffffffffffffffffffffffffffffffffffffffff
      and
      mul
      or
      swap1
      sstore
      pop
        /* "./Example.sol":26:132  contract Example {... */
      dataSize(sub_0)
      dup1
      dataOffset(sub_0)
      0x00
      codecopy
      0x00
      return
    stop

    sub_0: assembly {
            /* "./Example.sol":26:132  contract Example {... */
          mstore(0x40, 0x80)
          0x00
          dup1
          revert

        auxdata: 0xa26469706673582212209625abe2c853fe00901ebd3c2636931b5bf7b41ae052accf10eaa6d7d027d7fc64736f6c634300060c0033
    }
Example.bin
    6080604052348015600f57600080fd5b50336000806101000a81548173ffffffffffffffffffffffffffffffffffffffff021916908373ffffffffffffffffffffffffffffffffffffffff160217905550603f80605d6000396000f3fe6080604052600080fdfea26469706673582212209625abe2c853fe00901ebd3c2636931b5bf7b41ae052accf10eaa6d7d027d7fc64736f6c634300060c0033
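The relationship between Example.bin and Example.opcode can be checked by hand: each byte of the hex string is an opcode, and each PUSHx is followed by x bytes of immediate data. The following Python sketch is a minimal disassembler that knows only the handful of opcodes needed for the first 18 bytes of the file. The function name and the tiny opcode table are illustrative assumptions, not a complete tool.

```python
# Partial opcode table: just enough for the start of Example.bin.
NAMES = {
    0x00: "STOP", 0x15: "ISZERO", 0x33: "CALLER", 0x34: "CALLVALUE",
    0x50: "POP", 0x52: "MSTORE", 0x57: "JUMPI", 0x5B: "JUMPDEST", 0xFD: "REVERT",
}

def disassemble(hex_code: str) -> list:
    code, out, pc = bytes.fromhex(hex_code), [], 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if 0x60 <= op <= 0x7F:                      # PUSH1..PUSH32 carry immediate data
            n = op - 0x5F
            value = int.from_bytes(code[pc:pc + n], "big")
            out.append(f"PUSH{n} 0x{value:X}")
            pc += n
        elif 0x80 <= op <= 0x8F:                    # DUP1..DUP16
            out.append(f"DUP{op - 0x7F}")
        elif 0x90 <= op <= 0x9F:                    # SWAP1..SWAP16
            out.append(f"SWAP{op - 0x8F}")
        else:
            out.append(NAMES.get(op, f"UNKNOWN_0x{op:02X}"))
    return out

print(" ".join(disassemble("6080604052348015600f57600080fd5b5033")))
# PUSH1 0x80 PUSH1 0x40 MSTORE CALLVALUE DUP1 ISZERO PUSH1 0xF JUMPI PUSH1 0x0 DUP1 REVERT JUMPDEST POP CALLER
```

The printed line matches the beginning of Example.opcode, which confirms that the two files describe the same bytecode in different notations.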
At first glance, the files above are hard to make sense of. But with the EVM instructions listed above, let's try to decode the first fragment of the Example.opcode file:
    PUSH1 0x80 PUSH1 0x40 MSTORE CALLVALUE
- The two PUSH1 instructions each push a 1-byte value onto the top of the stack: first 0x80, then 0x40.
- MSTORE takes 2 arguments from the top of the stack: the first argument (the top item, 0x40) is the memory address and the second argument (0x80) is the value. It removes both, leaving the stack empty, and stores the value 0x80 at memory address 0x40.
- CALLVALUE pushes onto the top of the stack the amount of wei sent with the transaction that deploys the contract. In this case the amount is 0; since the constructor is not payable, the instructions that follow revert the deployment if it is not.
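The same steps can be replayed in a few lines of Python, using the values 0x80 and 0x40 as they appear in Example.opcode. This is only a hand-written trace of these four instructions under simplified assumptions (a list as the stack, a pre-sized bytearray as memory), and the function name is hypothetical.

```python
def run_fragment(callvalue: int = 0):
    """Trace PUSH1 0x80, PUSH1 0x40, MSTORE, CALLVALUE and return (stack, memory)."""
    stack = []
    memory = bytearray(0x60)                # pre-sized so the word at 0x40 fits

    stack.append(0x80)                      # PUSH1 0x80
    stack.append(0x40)                      # PUSH1 0x40

    offset = stack.pop()                    # MSTORE: first argument (top of stack) is the address
    value = stack.pop()                     #         second argument is the value
    memory[offset:offset + 32] = value.to_bytes(32, "big")

    stack.append(callvalue)                 # CALLVALUE: wei sent with the deployment
    return stack, memory

stack, memory = run_fragment()
```

After the call, the stack holds a single item (the call value, 0 here) and the 32-byte word at memory address 0x40 contains 0x80.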