Everything you need to know about semicolon insertion in JavaScript

Tram Ho

Preamble

  • Automatic semicolon insertion is one of the most controversial syntax features of JavaScript. There are also many misconceptions surrounding it.
  • Some JavaScript programmers put a semicolon at the end of every statement, and some use semicolons only where strictly required. Most fall somewhere in between, and some add semicolons purely as a matter of style.
  • Even if you put a semicolon at the end of every statement, some constructs still parse in non-obvious ways. Whether or not you prefer to write semicolons, you need to know the rules to write JavaScript professionally. All of the rules are explained in this article, so you will be able to understand how any program you encounter is parsed. After reading it, I hope you will be an expert in JavaScript's automatic semicolon insertion (ASI).

Where semicolons are allowed

  • In the formal grammar given in the ECMAScript specification, semicolons are shown at the end of each kind of statement in which they can appear. For example, the do-while statement has the form: do Statement while ( Expression ) ;

  • The semicolon also appears in the grammar at the end of the var statement, the expression statement (such as “4 + 4;” or “f();”), and the continue, return, break, throw, and debugger statements.
  • The empty statement is a semicolon, and is a valid statement in JavaScript. For this reason, “;;;” is a valid JavaScript program. It parses as three empty statements and runs by doing nothing three times.
  • Sometimes empty statements are really useful, at least syntactically. For example, to write an infinite loop, one could write while (1); in which the semicolon is parsed as an empty statement, making the while statement syntactically valid. If the semicolon were omitted, the while statement would be incomplete, because a statement is required after the loop condition.
  • Finally, semicolons appear in the header of the for loop, as in for (init; test; update) statement, and of course they can appear inside strings and regular expressions.
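
As a rough illustration of the positions listed above, the sketch below probes a few snippets. The helper parses() is hypothetical and not part of the original article; it simply asks the engine to compile a snippet with the Function constructor and reports whether that succeeded.

```javascript
// Hypothetical helper: true if the engine accepts `source` as a function body.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

const results = {
  doWhile: parses("var i = 0; do i++; while (i < 3);"),       // true
  emptyStatements: parses(";;;"),                             // true
  loopWithEmptyBody: parses("while (0);"),                    // true
  loopWithoutBody: parses("while (0)"),                       // false: a statement is required
  forHeader: parses("for (var i = 0; i < 3; i++) ;"),         // true
  insideStringAndRegExp: parses('var s = "a;b"; var r = /;/;'), // true
};
console.log(results);
```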

Where semicolons can be omitted

  • In the formal grammar used in the ECMAScript specification, semicolons are included as described above. However, the specification then gives rules that describe how actual parsing differs from the formal grammar.
  • This section outlines three basic rules, followed by two exceptions. The rules are:
  1. When the program contains a token that the formal grammar does not allow, a semicolon is inserted if (a) there is a line break at that point, or (b) the unexpected token is a closing brace.
  2. When the end of the file is reached and the program cannot be parsed otherwise, a semicolon is inserted.
  3. When a “restricted production” is encountered and a line break appears at the place where the grammar carries the annotation “no LineTerminator here”, a semicolon is inserted.
  • Roughly, these rules state that a statement can be terminated without a semicolon (a) before a closing brace, (b) at the end of the program, or (c) when the next token, on a new line, cannot be parsed as part of the statement.
  • The exceptions are that a semicolon is never inserted as part of the header of a for loop, for (init; test; update), and a semicolon is never inserted if it would be parsed as an empty statement.

  • 42; "hello!" is a valid program, and so is 42\n"hello!" (with “\n” representing an actual line break), but 42 "hello!" is not. A line break triggers semicolon insertion, but a space does not. “if (x) {y()}” is also valid. Here “y()” is an expression statement, which would normally be terminated with a semicolon, but since the next token is a closing brace, the semicolon is optional even though there is no line break.
  • The two exceptions, for the for loop header and for the empty statement, can be demonstrated together:
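
A minimal sketch of such a loop follows. The node objects and the getNode() function are hypothetical stand-ins introduced only so that the loop can run; the for statement itself contains exactly three semicolons, each at the end of a line.

```javascript
// Hypothetical node chain: leaf -> middle -> root (root has no parent).
const root = { name: "root", parent: null };
const middle = { name: "middle", parent: root };
const leaf = { name: "leaf", parent: middle };

// Hypothetical accessor for the starting node.
function getNode() {
  return leaf;
}

let node;
for (node = getNode();
     node.parent;
     node = node.parent) ;

console.log(node.name); // prints "root"
```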

This for loop walks from a node up to its parent repeatedly until it reaches a node that has no parent. All of the work is done in the for loop header, so there is nothing left for the loop body to do. However, the for loop syntax requires a statement, so we use an empty statement. Although all three semicolons of the for statement appear at the end of a line, all three are required, because a semicolon is never inserted into a for loop header or to create an empty statement.
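
The rules and exceptions in this section can be checked with a small probe. This is only a sketch: parses() is a hypothetical helper (not from the original article) that compiles a snippet with the Function constructor and reports whether a SyntaxError was raised.

```javascript
// Hypothetical helper: true if the engine accepts `source` as a function body.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

const results = {
  explicitSemicolon: parses('42; "hello!"'),               // true
  lineBreak: parses('42\n"hello!"'),                       // true: semicolon inserted
  spaceOnly: parses('42 "hello!"'),                        // false: no line break
  beforeClosingBrace: parses("if (x) {y()}"),              // true: closing brace follows
  forHeaderLineBreak: parses("for (var i = 0; i < 3\n i++) ;"), // false: never in a for header
  emptyStatementNeeded: parses("if (x)\nelse y()"),        // false: never as an empty statement
};
console.log(results);
```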

Restricted Productions

Introduction

A restricted production is a grammar rule in which a line break is not allowed at a particular point. If a line break does appear there, the program is not parsed in the intended way, although it may still parse and run in some other way.

Classification

  • There are five restricted productions: the postfix ++ and -- operators, and the continue, break, return, and throw statements. The break and continue statements can be followed by a label that names the loop to exit or continue. If a label is given, it must be on the same line as the break or continue keyword. The following is a valid program:
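
A minimal sketch of such a program is shown here. The input array and the body of getc() are hypothetical stand-ins for a real input device, and the handled array is added only to make the effect visible; the names getc, quitchars, and charloop follow the description in the text.

```javascript
// Hypothetical stand-in for an input device: getc() returns the next character.
const input = ["a", "b", "q", "c"];
let position = 0;
function getc() {
  return input[position++];
}

const quitchars = ["q", "Q"];
const handled = [];

charloop:
while (position < input.length) {
  const c = getc();
  for (let i = 0; i < quitchars.length; i++) {
    // The label is on the same line as break, so the while loop is exited.
    if (c === quitchars[i]) { break charloop }
  }
  handled.push(c); // reached only for characters that are not quit characters
}

console.log(handled.join("")); // prints "ab"
```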

getc() reads a character from an input device and returns it. The program reads characters one at a time and checks each one against the quitchars array; if it finds a match, the loop ends. Because the break statement names the label charloop, it exits the outer while loop, not just the inner for loop.

  • The following program, which differs only in whitespace, parses differently and does not produce the same result:
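
The sketch below uses the same hypothetical input and getc() stand-ins as an illustration; the only change of interest is the line break between break and charloop.

```javascript
// Hypothetical stand-in for an input device: getc() returns the next character.
const input = ["a", "b", "q", "c"];
let position = 0;
function getc() {
  return input[position++];
}

const quitchars = ["q", "Q"];
const handled = [];

charloop:
while (position < input.length) {
  const c = getc();
  for (let i = 0; i < quitchars.length; i++) {
    // The line break ends the break statement; "charloop" on the next line
    // is a separate expression statement placed after the break.
    if (c === quitchars[i]) { break
      charloop }
  }
  handled.push(c); // now reached for every character, including "q"
}

console.log(handled.join("")); // prints "abqc"
```

In this sketch the while loop ends only because the hypothetical input runs out; the break no longer has any effect on it.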

Specifically, the label charloop is no longer part of the break statement. A semicolon is automatically inserted after break, so the break exits only the inner loop, and charloop is then parsed as a separate expression statement, a bare reference to a variable named charloop, rather than as the target of the break. As a result the while loop is never exited by the break and keeps running.

  • Here are examples that illustrate the other four restricted productions :
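
A small sketch of the first case, a line break before a postfix operator. The parses() helper is hypothetical (not from the original article) and only reports whether the engine accepts a snippet.

```javascript
// Hypothetical helper: true if the engine accepts `source` as a function body.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

// "i" on one line and "++" alone on the next.
const brokenPostfix = parses("var i = 1\ni\n++");
// The same operator kept on the line of its operand.
const normalPostfix = parses("var i = 1\ni++");

console.log(brokenPostfix, normalPostfix); // prints: false true
```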

This is a syntax error; it does not parse as “i++”. A line break cannot appear before a postfix increment or decrement operator, so a “++” or “--” at the beginning of a line is never parsed as part of the previous line.
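
A short sketch of the next case, in which the operator is followed by another operand on the line after it (the variable names i and j are chosen here for illustration):

```javascript
let i = 1;
let j = 1;

i
++
j

console.log(i, j); // prints: 1 2
```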


This is not a syntax error; it parses as “i; ++j”. A line break after a prefix “++” or “--” operator has no effect: the operator is still parsed with the expression it modifies.
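
For the return statement, a minimal sketch (the function name sum is hypothetical) with the value placed on the line after the keyword:

```javascript
function sum(a, b) {
  return
    a + b
}

console.log(sum(1, 2)); // prints: undefined
```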


This code parses as an empty return statement, followed by an expression statement that is never reached. The following forms produce the intended return statement:
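
A few possible forms, sketched with hypothetical function names; in each one the expression starts on the same line as the return keyword and only continues on the next line.

```javascript
// The object literal opens on the same line as return.
function asObject(i, j) {
  return {
    i: i, j: j }
}

// An opening parenthesis on the return line keeps the expression attached.
function parenthesized(i, j) {
  return (
    i + j)
}

// The expression starts on the return line and breaks after an operator.
function splitLater(i, j) {
  return i +
    j
}

console.log(asObject(1, 2).j, parenthesized(1, 2), splitLater(1, 2)); // prints: 2 3 3
```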


Note that a return statement may contain line breaks within the expression, just not between the return keyword and the start of the expression. When semicolons are omitted, this rule is convenient because it lets the programmer write an empty return statement without accidentally returning the value of the next line:
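
A sketch of that pattern, written without semicolons; the initialize function and the count property are hypothetical and exist only to show that the early return takes effect.

```javascript
function initialize(a) {
  // If already initialized, do nothing.
  if (a.initialized) return
  a.initialized = true
  a.count = (a.count || 0) + 1 // hypothetical initialization work
}

const target = {}
initialize(target)
initialize(target)

console.log(target.count) // prints: 1
```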

The continue and throw statements are similar to break and return:
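
A sketch of both cases, using a hypothetical label named outer and a hypothetical parses() helper that reports whether the engine accepts a snippet. With “continue outer” on one line the loop below would visit only “0,0 1,0”; with the line break, the label is detached and the inner loop is continued instead.

```javascript
// Hypothetical helper: true if the engine accepts `source` as a function body.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

const visited = [];
outer:
for (let row = 0; row < 2; row++) {
  for (let col = 0; col < 3; col++) {
    // The line break ends the continue statement; "outer" on the next line
    // is a separate expression statement that is never reached.
    if (col === 1) { continue
      outer }
    visited.push(row + "," + col);
  }
}
console.log(visited.join(" ")); // prints "0,0 0,2 1,0 1,2"

// A line break after throw is a syntax error rather than a silent change.
console.log(parses('throw new Error("oops")'));  // prints true
console.log(parses('throw\nnew Error("oops")')); // prints false
```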

Note that indentation has no effect on the parsing of ECMAScript programs, but the presence or absence of line breaks does. Therefore, any tool that processes JavaScript source code can remove leading whitespace from lines (except inside strings) without changing the semantics of the program, but line breaks cannot be indiscriminately removed, or replaced with spaces or semicolons.

Common mistake

  • The most common mistake is to put the return value on the line after the return keyword, which is especially common when the value is a large object, a long string, or a multi-line string. Misplaced line breaks with the postfix operators, break, continue, and throw are rarely seen in practice, for the simple reason that such line breaks look unnatural to most programmers and so are unlikely to be written.

Attention

  • The final subtlety of ASI arises from the first rule, which requires the program to contain a token that the grammar does not allow before a semicolon is inserted. When writing code with optional semicolons omitted, keep this rule in mind so that a required semicolon is not accidentally left out. This rule is what makes it possible to extend statements across multiple lines, as in the following examples:
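
Two illustrative statements that continue across lines, written without semicolons (the variable names are chosen here for illustration):

```javascript
// Each following line starts with a token that can extend the statement,
// so no semicolon is inserted until the statement can no longer continue.
const total = 1
  + 2
  + 3

const words = ["semi", "colon"]
  .map(function (word) { return word.toUpperCase() })
  .join("-")

console.log(total, words) // prints: 6 SEMI-COLON
```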

The rule looks only at the first token of the following line. If that token can be parsed as part of the statement, the statement continues. If it cannot extend the statement, a new statement begins (and a semicolon is inserted automatically, as the specification describes).

  • An error is possible whenever there is a pair of statements A and B such that both are valid on their own, but the first token of B can also be accepted as an extension of A. In such cases, if no semicolon is provided, the parser will not treat B as a separate statement; it will either reject the program or parse it in an unintended way. Therefore, when semicolons are omitted, the programmer must be careful with any pair of statements separated by a line break in which the second line begins with “(”, “[”, “+”, “-”, or “/”.

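For example, if the semicolon is missing, the line a = b + c followed by the line (d + e).print() gives an unexpected result: no semicolon is inserted between them, and the two lines are parsed as the single statement a = b + c(d + e).print().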
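
A runnable sketch of this case follows. The function c and the calls array are hypothetical and are there only to show that c really is called with d + e as its argument.

```javascript
const calls = []
function c(argument) {
  calls.push(argument) // records that c was called as a function
  return { print: function () { return 10 } }
}
const b = 1, d = 2, e = 3

// No semicolon is inserted after "c", so this is one statement:
//   let a = b + c(d + e).print()
let a = b + c
(d + e).print()

console.log(calls, a) // calls holds the single argument 5, and a is 11
```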

  • The specification goes on to advise that, when an assignment statement must begin with a left parenthesis, the programmer should provide an explicit semicolon at the end of the preceding statement rather than rely on automatic semicolon insertion. A more robust alternative, where semicolons are intentionally omitted, is to put a semicolon at the beginning of the line, right before the token that creates the potential ambiguity:
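
A sketch of the leading-semicolon style, with hypothetical variable names; each line that starts with “[” or “(” is prefixed with a semicolon so that it cannot be absorbed into the previous statement.

```javascript
const printed = []
const b = 1, c = 2, d = 3, e = 4

const a = b + c
;[d, e].forEach(function (value) { printed.push(value) })
;(function () { printed.push(d + e) })()

console.log(a, printed.join(",")) // prints: 3 3,4,7
```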

  • The last tricky token is the slash, and the following code can produce surprising results:
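
A sketch consistent with the description that follows: the four lines of the snippet are the four strings in the source array (the text's line numbers refer to those four lines). The surrounding harness is an assumption added so the effect can be observed: it compiles the snippet with the Function constructor and then runs it.

```javascript
const source = [
  "var i, s",
  's = "here is a string"',
  "i = 0",
  "/[a-z]/g.exec(s)",
].join("\n");

let compiled = null;
let runtimeError = null;
try {
  compiled = new Function(source); // compiles: no SyntaxError is thrown
  compiled();                      // runs "i = 0 / [a - z] / g.exec(s)"
} catch (error) {
  runtimeError = error;
}

// Assuming no global variable named "a" exists, evaluating [a - z] fails.
console.log(compiled !== null, runtimeError && runtimeError.name);
```

The snippet compiles, but because lines 3 and 4 form a single division expression, running it fails with a ReferenceError instead of executing a regular expression.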

On lines 1-3 we declare and assign some variables, and on line 4 we construct the regular expression /[a-z]/g, which matches any character from a to z, and then apply it to the string s using the exec method. Because the return value of exec() is not used, this code is not very useful, but one would expect it to compile as written. However, the slash can not only begin a regular expression but also act as the division operator. That means the leading slash on line 4 is actually parsed as a continuation of the assignment statement on the previous line. Lines 3 and 4 together are parsed as “i equals 0 divided by [a-z] divided by g.exec(s)”.

Misconceptions

  • Many new JavaScript programmers are advised to use semicolons everywhere, in the expectation that if they never intentionally rely on semicolon insertion they can ignore this language feature entirely. This is not the case, because of the restricted productions described above, most notably the return statement. Once aware of restricted productions, programmers can become overly wary of line breaks and avoid them even where they would make the code clearer. Ideally, you should be familiar with all of the ASI rules, so that you can read any code regardless of how it is written and write your own code as clearly as possible.
  • Another misconception concerns browser compatibility. There is no reason to worry about it where semicolon insertion is concerned: all browsers follow the same rules, which are given by the ECMAScript specification and explained above.

Conclusion

  • Should you omit optional semicolons? The answer depends on your personal preference, but it should be an informed choice rather than one based on vague fears of unknown syntax traps or non-existent browser bugs. If you remember the rules given here, you are well equipped to make your own choice and to read any JavaScript code easily.
  • If you choose to omit semicolons where possible, my advice is to insert one immediately before the opening parenthesis or square bracket of any statement that begins with one of those tokens, and before any statement that begins with one of the arithmetic operator tokens “/”, “+”, or “-”.
  • Whether you omit semicolons or not, you must remember the restricted productions (return, break, continue, throw, and the postfix operators), and you should feel free to use line breaks everywhere else to improve the readability of your code.
  • Good luck! Reference source: http://inimino.org/~inimino/blog/javascript_semicolons

Source : Viblo