未验证 提交 f6f52a2b 编写于 作者: R Robert Fancsik 提交者: GitHub

Update the internals documentation (#2923)

JerryScript-DCO-1.0-Signed-off-by: Robert Fancsik frobert@inf.u-szeged.hu
上级 de71fe4f
......@@ -23,7 +23,7 @@ Expression parser is responsible for parsing JavaScript expressions. It is imple
JavaScript statements are parsed by this component. It uses the [Expression parser](#expression-parser) to parse the constituent expressions. The implementation of Statement parser is located in `./jerry-core/parser/js/js-parser-statm.c`.
Function `parser_parse_source` carries out the parsing and compiling of the input EcmaScript source code. When a function appears in the source `parser_parse_source` calls `parser_parse_function` which is responsible for processing the source code of functions recursively including argument parsing and context handling. After the parsing, function `parser_post_processing` dumps the created opcodes and returns an `ecma_compiled_code_t*` that points to the compiled bytecode sequence.
Function `parser_parse_source` carries out the parsing and compiling of the input ECMAScript source code. When a function appears in the source `parser_parse_source` calls `parser_parse_function` which is responsible for processing the source code of functions recursively including argument parsing and context handling. After the parsing, function `parser_post_processing` dumps the created opcodes and returns an `ecma_compiled_code_t*` that points to the compiled bytecode sequence.
The interactions between the major components shown on the following figure.
......@@ -33,19 +33,19 @@ The interactions between the major components shown on the following figure.
This section describes the compact byte-code (CBC) representation. The key focus is reducing memory consumption of the byte-code representation without sacrificing considerable performance. Other byte-code representations often focus on performance only so inventing this representation is an original research.
CBC is a CISC like instruction set which assigns shorter instructions for frequent operations. Many instructions represent multiple atomic tasks which reduces the byte code size. This technique is basically a data compression method.
CBC is a CISC like instruction set which assigns shorter instructions for frequent operations. Many instructions represent multiple atomic tasks which reduces the bytecode size. This technique is basically a data compression method.
## Compiled Code Format
The memory layout of the compiled byte code is the following.
The memory layout of the compiled bytecode is the following.
![CBC layout](img/CBC_layout.png)
The header is a `cbc_compiled_code` structure with several fields. These fields contain the key properties of the compiled code.
The literals part is an array of ecma values. These values can contain any EcmaScript value types, e.g. strings, numbers, function and regexp templates. The number of literals is stored in the `literal_end` field of the header.
The literals part is an array of ecma values. These values can contain any ECMAScript value types, e.g. strings, numbers, functions and regexp templates. The number of literals is stored in the `literal_end` field of the header.
CBC instruction list is a sequence of byte code instructions which represents the compiled code.
CBC instruction list is a sequence of bytecode instructions which represents the compiled code.
## Byte-code Format
......@@ -55,7 +55,7 @@ The memory layout of a byte-code is the following:
Each byte-code starts with an opcode. The opcode is one byte long for frequent and two byte long for rare instructions. The first byte of the rare instructions is always zero (`CBC_EXT_OPCODE`), and the second byte represents the extended opcode. The name of common and rare instructions start with `CBC_` and `CBC_EXT_` prefix respectively.
The maximum number of opcodes is 511, since 255 common (zero value excluded) and 256 rare instructions can be defined. Currently around 230 frequent and 120 rare instructions are available.
The maximum number of opcodes is 511, since 255 common (zero value excluded) and 256 rare instructions can be defined. Currently around 215 frequent and 70 rare instructions are available.
There are three types of bytecode arguments in CBC:
......@@ -63,7 +63,7 @@ There are three types of bytecode arguments in CBC:
* __literal argument__: An integer index which is greater or equal than zero and less than the `literal_end` field of the header. For further information see next section Literals (next).
* __relative branch__: An 1-3 byte long offset. The branch argument might also represent the end of an instruction range. For example the branch argument of `CBC_EXT_WITH_CREATE_CONTEXT` shows the end of a `with` statement. More precisely the position after the last instruction.
* __relative branch__: An 1-3 byte long offset. The branch argument might also represent the end of an instruction range. For example the branch argument of `CBC_EXT_WITH_CREATE_CONTEXT` shows the end of a `with` statement. More precisely the position after the last instruction in the with clause.
Argument combinations are limited to the following seven forms:
......@@ -137,12 +137,12 @@ Byte-codes of this category serve for placing objects onto the stack. As there a
<span class="CSSTableGenerator" markdown="block">
| byte-code | description |
| --------------------- | ---------------------------------------------------- |
| CBC_PUSH_LITERAL | Pushes the value of the given literal argument. |
| CBC_PUSH_TWO_LITERALS | Pushes the value of the given two literal arguments. |
| CBC_PUSH_UNDEFINED | Pushes an undefined value. |
| CBC_PUSH_TRUE | Pushes a logical true. |
| byte-code | description |
| --------------------- | ----------------------------------------------------- |
| CBC_PUSH_LITERAL | Pushes the value of the given literal argument. |
| CBC_PUSH_TWO_LITERALS | Pushes the values of the given two literal arguments. |
| CBC_PUSH_UNDEFINED | Pushes an undefined value. |
| CBC_PUSH_TRUE | Pushes a logical true. |
| CBC_PUSH_PROP_LITERAL | Pushes a property whose base object is popped from the stack, and the property name is passed as a literal argument. |
</span>
......@@ -196,7 +196,7 @@ Branch byte-codes are used to perform conditional and unconditional jumps in the
| CBC_JUMP_BACKWARD | Jumps backward by the 1 byte long relative offset argument. |
| CBC_JUMP_BACKWARD_2 | Jumps backward by the 2 byte long relative offset argument. |
| CBC_JUMP_BACKWARD_3 | Jumps backward by the 3 byte long relative offset argument. |
| CBC_BRANCH_IF_TRUE_FORWARD | Jumps if the value on the top of the stack is true by the 1 byte long relative offset argument. |
| CBC_BRANCH_IF_TRUE_FORWARD | Jumps forward if the value on the top of the stack is true by the 1 byte long relative offset argument. |
</span>
......@@ -219,12 +219,14 @@ ECMA component of the engine is responsible for the following notions:
## Data Representation
The major structure for data representation is `ECMA_value`. The lower two bits of this structure encode value tag, which determines the type of the value:
The major structure for data representation is `ECMA_value`. The lower three bits of this structure encode value tag, which determines the type of the value:
* simple
* number
* string
* object
* symbol
* error
![ECMA value representation](img/ecma_value.png)
......@@ -275,7 +277,6 @@ The objects are represented as following structure:
* Reference counter - number of hard (non-property) references
* Next object pointer for the garbage collector
* GC's visited flag
* type (function object, lexical environment, etc.)
### Properties of Objects
......@@ -323,7 +324,7 @@ Collections are array-like data structures, which are optimized to save memory.
### Exception Handling
In order to implement a sense of exception handling, the return values of JerryScript functions are able to indicate their faulty or "exceptional" operation. The return values are actually ECMA values (see section [Data Representation](#data-representation)) in which the error bit is set if an erroneous operation is occurred.
In order to implement a sense of exception handling, the return values of JerryScript functions are able to indicate their faulty or "exceptional" operation. The return values are ECMA values (see section [Data Representation](#data-representation)) and if an erroneous operation occurred the ECMA_VALUE_ERROR simple value is returned.
### Value Management and Ownership
......
docs/img/ecma_object.png

27.0 KB | W: | H:

docs/img/ecma_object.png

29.5 KB | W: | H:

docs/img/ecma_object.png
docs/img/ecma_object.png
docs/img/ecma_object.png
docs/img/ecma_object.png
  • 2-up
  • Swipe
  • Onion skin
docs/img/ecma_value.png

7.0 KB | W: | H:

docs/img/ecma_value.png

6.2 KB | W: | H:

docs/img/ecma_value.png
docs/img/ecma_value.png
docs/img/ecma_value.png
docs/img/ecma_value.png
  • 2-up
  • Swipe
  • Onion skin
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册