A .dex file is the transport format for Dalvik bytecode. A file must satisfy certain syntactic and semantic constraints to be a valid .dex file, and a runtime is required to support only valid .dex files.
General .dex integrity constraints
General integrity constraints are concerned with the larger structure of a .dex file, as described in detail in .dex format.
Identifier | Description |
---|---|
G1 | The magic number of the .dex file must be dex\n035\0 for version 35, or similar for later versions. |
G2 | The checksum must be an Adler-32 checksum of the whole file contents except the magic and checksum fields. |
G3 | The signature must be a SHA-1 hash of the whole file contents except magic, checksum, and signature. |
G4 | The The |
G5 | The The |
G6 | The endian_tag must have the value ENDIAN_CONSTANT or REVERSE_ENDIAN_CONSTANT. |
G7 | For each of the The |
G8 | All offset fields in the header except map_off must be four-byte-aligned. |
G9 | The map_off field must be either zero or point into the data section. In the latter case, the data section must exist. |
G10 | The link, string_ids, type_ids, proto_ids, field_ids, method_ids, class_defs, and data sections must not overlap each other or the header. |
G11 | If a map exists, then each map entry must have a valid type. Each type may appear at most once. |
G12 | If a map exists, then each map entry must have a non-zero offset and size. The offset must point into the corresponding section of the file (for example, a string_id_item must point into the string_ids section), and the explicit or implicit size of the item must match the actual contents and size of the section. |
G13 | If a map exists, then the offset of map entry n+1 must be greater than or equal to the offset of map entry n plus the size of map entry n. This implies non-overlapping entries and low-to-high ordering. |
G14 | The following types of entries must have an offset that is four-byte-aligned: string_id_item, type_id_item, proto_id_item, field_id_item, method_id_item, class_def_item, type_list, code_item, annotations_directory_item. |
G15 | For each For each For the referenced |
G16 | For each type_id_item, the descriptor_idx field must contain a valid reference into the string_ids list. The referenced string must be a valid type descriptor. |
G17 | For each proto_id_item, the shorty_idx field must contain a valid reference into the string_ids list. The referenced string must be a valid shorty descriptor. Also, the return_type_idx field must be a valid index into the type_ids section, and the parameters_off field must be either zero or a valid offset pointing into the data section. If non-zero, the parameter list must not contain any void entries. |
G18 | For each field_id_item, both the class_idx and type_idx fields must be valid indices into the type_ids list. The entry referenced by class_idx must be a non-array reference type. In addition, the name_idx field must be a valid reference into the string_ids section, and the contents of the referenced entry must conform to the MemberName specification. |
G19 | For each method_id_item, the class_idx field must be a valid index into the type_ids section, and the referenced entry must be a non-array reference type. The proto_idx field must be a valid reference into the proto_ids list. The name_idx field must be a valid reference into the string_ids section, and the contents of the referenced entry must conform to the MemberName specification. |
G20 | For each field_id_item, the class_idx field must be a valid index into the type_ids list. The referenced entry must be a non-array reference type. |
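
As an illustration, constraints G1, G2, G3, and G6 can be checked directly on the raw bytes of a file. The following is a minimal Python sketch, not a reference verifier. The byte offsets (checksum at 8, signature at 12, endian_tag at 40), the little-endian reads, and the two constant values are assumptions taken from the .dex format specification rather than from this page, and `check_header` is a hypothetical name.

```python
import hashlib
import struct
import zlib

# Assumed from the .dex format specification (not stated on this page):
# magic at bytes 0..7, checksum at 8..11, signature at 12..31, endian_tag at 40..43.
ENDIAN_CONSTANT = 0x12345678
REVERSE_ENDIAN_CONSTANT = 0x78563412


def check_header(data: bytes) -> list[str]:
    """Return the identifiers of the header constraints that `data` violates."""
    if len(data) < 44:
        raise ValueError("file is too short to contain the fields checked here")
    violated = []

    # G1: "dex\n", three ASCII version digits (035 or later), then "\0".
    magic = data[0:8]
    version = magic[4:7]
    magic_ok = (
        magic[:4] == b"dex\n"
        and magic[7:8] == b"\0"
        and version.isdigit()
        and int(version.decode("ascii")) >= 35
    )
    if not magic_ok:
        violated.append("G1")

    # G2: Adler-32 over everything after the magic and checksum fields.
    (stored_checksum,) = struct.unpack_from("<I", data, 8)
    if zlib.adler32(data[12:]) & 0xFFFFFFFF != stored_checksum:
        violated.append("G2")

    # G3: SHA-1 over everything after magic, checksum, and signature.
    if hashlib.sha1(data[32:]).digest() != data[12:32]:
        violated.append("G3")

    # G6: endian_tag must be one of the two permitted constants.
    (endian_tag,) = struct.unpack_from("<I", data, 40)
    if endian_tag not in (ENDIAN_CONSTANT, REVERSE_ENDIAN_CONSTANT):
        violated.append("G6")

    return violated
```

The function reports every violated constraint rather than stopping at the first one, which tends to be more useful when diagnosing a malformed file. A file whose contents were modified after the header was written would typically fail both G2 and G3.
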
Static bytecode constraints
Static constraints are constraints on individual elements of the bytecode. They usually can be checked without employing control or data-flow analysis techniques.
Identifier | Description |
---|---|
A1 | The insns array must not be empty. |
A2 | The first opcode in the insns array must have index zero. |
A3 | The insns array must contain only valid Dalvik opcodes. |
A4 | The index of instruction n+1 must equal the index of instruction n plus the length of instruction n, taking into account possible operands. |
A5 | The last instruction in the insns array must end at index insns_size-1. |
A6 | All goto and if-<kind> targets must be opcodes within the same method. |
A7 | All targets of a packed-switch instruction must be opcodes within the same method. The size and the list of targets must be consistent. |
A8 | All targets of a sparse-switch instruction must be opcodes within the same method. The corresponding table must be consistent and sorted low-to-high. |
A9 | The B operand of the const-string and const-string/jumbo instructions must be a valid index into the string constant pool. |
A10 | The C operand of the iget<kind> and iput<kind> instructions must be a valid index into the field constant pool. The referenced entry must represent an instance field. |
A11 | The C operand of the sget<kind> and sput<kind> instructions must be a valid index into the field constant pool. The referenced entry must represent a static field. |
A12 | The C operand of the invoke-virtual, invoke-super, invoke-direct, and invoke-static instructions must be a valid index into the method constant pool. |
A13 | The B operand of the invoke-virtual/range, invoke-super/range, invoke-direct/range, and invoke-static/range instructions must be a valid index into the method constant pool. |
A14 | A method whose name starts with '<' must be invoked only implicitly by the VM, not by code originating from a .dex file. The only exception is the instance initializer, which may be invoked by invoke-direct. |
A15 | The C operand of the invoke-interface instruction must be a valid index into the method constant pool. The referenced method_id must belong to an interface (not a class). |
A16 | The B operand of the invoke-interface/range instruction must be a valid index into the method constant pool. The referenced method_id must belong to an interface (not a class). |
A17 | The B operand of the const-class, check-cast, new-instance, and filled-new-array/range instructions must be a valid index into the type constant pool. |
A18 | The C operand of the instance-of, new-array, and filled-new-array instructions must be a valid index into the type constant pool. |
A19 | The dimensions of an array created by a new-array instruction must be less than 256. |
A20 | The new-instance instruction must not refer to array classes, interfaces, or abstract classes. |
A21 | The type referred to by a new-array instruction must be a valid, non-reference type. |
A22 | All registers referred to by an instruction in a single-width (non-pair) fashion must be valid for the current method. That is, their indices must be non-negative and smaller than registers_size. |
A23 | All registers referred to by an instruction in a double-width (pair) fashion must be valid for the current method. That is, their indices must be non-negative and smaller than registers_size-1. |
A24 | The method_id operand of the invoke-virtual and invoke-direct instructions must belong to a class (not an interface). In .dex files prior to version 037, the same must be true of invoke-super and invoke-static instructions. |
A25 | The method_id operand of the invoke-virtual/range and invoke-direct/range instructions must belong to a class (not an interface). In .dex files prior to version 037, the same must be true of invoke-super/range and invoke-static/range instructions. |
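
Several of the static constraints reduce to simple index arithmetic. The sketch below illustrates A1, A2, A4, A5, A6, A22, and A23 in Python under stated assumptions: instruction lengths (in code units) are supplied by the caller instead of being decoded from real opcodes, "must be opcodes within the same method" is read as "must be the start index of an instruction in this method", and all function names are hypothetical.

```python
def instruction_starts(lengths: list[int], insns_size: int) -> set[int]:
    """Check A1, A2, A4, and A5; return the set of indices where instructions start.

    `lengths` holds the length of each instruction, in order, measured in the
    same units as `insns_size`.
    """
    if insns_size == 0 or not lengths:
        raise ValueError("A1: the insns array must not be empty")
    starts = set()
    index = 0  # A2: the first opcode is at index zero
    for length in lengths:
        if length <= 0:
            raise ValueError("every instruction must have a positive length")
        starts.add(index)
        index += length  # A4: next index = current index + current length
    if index != insns_size:
        # A5: the last instruction must end exactly at insns_size - 1
        raise ValueError("A5: instructions do not exactly fill the insns array")
    return starts


def branch_targets_valid(targets: list[int], starts: set[int]) -> bool:
    """A6 (as interpreted here): every target is the start of an instruction."""
    return all(target in starts for target in targets)


def register_valid(index: int, registers_size: int, wide: bool = False) -> bool:
    """A22 for single-width registers, A23 for register pairs (wide=True)."""
    limit = registers_size - 1 if wide else registers_size
    return 0 <= index < limit
```

For example, with lengths `[1, 2, 3, 1]` and `insns_size` 7, the instruction starts are 0, 1, 3, and 6, so a branch to index 2 would land in the middle of an instruction and fail the A6 check. With `registers_size` 4, register 3 is valid on its own but not as the low half of a pair.
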
Structural bytecode constraints
Structural constraints are constraints on relationships between several elements of the bytecode. They usually can't be checked without employing control or data-flow analysis techniques.
Identifier | Description |
---|---|
B1 | The number and types of arguments (registers and immediate values) must always match the instruction. |
B2 | Register pairs must never be broken up. |
B3 | A register (or pair) has to be assigned first before it can be read. |
B4 | An invoke-direct instruction must invoke an instance initializer or a method only in the current class or one of its superclasses. |
B5 | An instance initializer must be invoked only on an uninitialized instance. |
B6 | Instance methods may be invoked, and instance fields may be accessed, only on already initialized instances. |
B7 | A register that holds the result of a new-instance instruction must not be used if the same new-instance instruction is again executed before the instance is initialized. |
B8 | An instance initializer must call another instance initializer (same class or superclass) before any instance members can be accessed. Exceptions are non-inherited instance fields, which can be assigned before calling another initializer, and the Object class in general. |
B9 | All actual method arguments must be assignment-compatible with their respective formal arguments. |
B10 | For each instance method invocation, the actual instance must be assignment-compatible with the class or interface specified in the instruction. |
B11 | A return<kind> instruction must match its method's return type. |
B12 | When accessing protected members of a superclass, the actual type of the instance being accessed must be either the current class or one of its subclasses. |
B13 | The type of a value stored into a static field must be assignment-compatible with or convertible to the field's type. |
B14 | The type of a value stored into a field must be assignment-compatible with or convertible to the field's type. |
B15 | The type of every value stored into an array must be assignment-compatible with the array's component type. |
B16 | The A operand of a throw instruction must be assignment-compatible with java.lang.Throwable. |
B17 | The last reachable instruction of a method must either be a backwards goto or branch, a return, or a throw instruction. It must not be possible to leave the insns array at the bottom. |
B18 | The unassigned half of a former register pair may not be read (is considered invalid) until it has been re-assigned by some other instruction. |
B19 | A move-result<kind> instruction must be immediately preceded (in the insns array) by an invoke-<kind> instruction. The only exception is the move-result-object instruction, which may also be preceded by a filled-new-array instruction. |
B20 | A move-result<kind> instruction must be immediately preceded (in actual control flow) by a matching return-<kind> instruction (it must not be jumped to). The only exception is the move-result-object instruction, which may also be preceded by a filled-new-array instruction. |
B21 | A move-exception instruction must appear only as the first instruction in an exception handler. |
B22 | The packed-switch-data, sparse-switch-data, and fill-array-data pseudo-instructions must not be reachable by control flow. |
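
To show why structural constraints generally need data-flow analysis, here is a minimal Python sketch of a check for B3 (a register must be assigned before it is read). It is a simplified illustration, not how any particular runtime implements verification: the `Insn` record, its fields, and `find_unassigned_reads` are hypothetical, register pairs and types are ignored, and the control-flow successors of each instruction are assumed to be known already.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insn:
    """Toy instruction: registers read, registers written, successor indices."""
    reads: tuple[int, ...] = ()
    writes: tuple[int, ...] = ()
    successors: tuple[int, ...] = ()


def find_unassigned_reads(insns, initially_assigned=frozenset()):
    """Return sorted (instruction index, register) pairs that violate B3.

    A register counts as assigned at an instruction only if it has been
    written on every path that reaches that instruction.
    """
    if not insns:
        return []
    # assigned_in[i]: registers assigned on all paths into i (None = not reached yet).
    assigned_in = [None] * len(insns)
    assigned_in[0] = frozenset(initially_assigned)  # e.g. the method's parameters
    worklist = [0]
    while worklist:
        i = worklist.pop()
        assigned_out = assigned_in[i] | frozenset(insns[i].writes)
        for s in insns[i].successors:
            # At a merge point, keep only registers assigned on every incoming path.
            merged = assigned_out if assigned_in[s] is None else assigned_in[s] & assigned_out
            if merged != assigned_in[s]:
                assigned_in[s] = merged
                worklist.append(s)
    return sorted(
        (i, register)
        for i, insn in enumerate(insns)
        if assigned_in[i] is not None  # unreachable instructions are skipped
        for register in insn.reads
        if register not in assigned_in[i]
    )
```

Consider four instructions: instruction 0 writes v0, instruction 1 reads v0 and branches to either 2 or 3, instruction 2 writes v1, and instruction 3 reads v1. Because one path reaches instruction 3 without passing through instruction 2, the sketch reports the read of v1 at instruction 3. No inspection of a single instruction in isolation could detect this, which is what distinguishes structural constraints from static ones.
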