Constraints

A .dex file is the transport format for Dalvik bytecode. A file must satisfy certain syntactic and semantic constraints to be a valid .dex file, and a runtime is required to support only valid .dex files.

General .dex integrity constraints

General integrity constraints are concerned with the larger structure of a .dex file, as described in detail in .dex format.

Identifier  Description
G1  The magic number of the .dex file must be dex\n035\0 for version 35, or similar for later versions.
G2  The checksum must be an Adler-32 checksum of the whole file contents except magic and checksum field.
G3  The signature must be a SHA-1 hash of the whole file contents except magic, checksum, and signature.
G4

The file_size must match the actual file size in bytes. (v40 or earlier)

The file_size must point to the next header in the container, or to the end of the physical file (the container). If it points to the next header, the file size must be 4-byte aligned. The sum of all file_size fields must equal container_size. (v41 or later)

G5

The header_size must have the value: 0x70 (v40 or earlier)

The header_size must have the value: 0x78 (v41 or later)

G6  The endian_tag must have either the value: ENDIAN_CONSTANT or REVERSE_ENDIAN_CONSTANT.
G7

For each of the link, string_ids, type_ids, proto_ids, field_ids, method_ids, class_defs, and data sections, the offset and size fields must be both zero or both non-zero. In the latter case, the offset must be four-byte-aligned.

The offset and size fields must be within the container and refer to data that is located after the header that defines them. (v41 or later)

G8  All offset fields in the header except map_off must be four-byte-aligned.
G9  The map_off field must be either zero or point into the data section. In the latter case, the data section must exist.
G10  The link, string_ids, type_ids, proto_ids, field_ids, method_ids, class_defs, and data sections must not overlap each other or the header.
G11  If a map exists, then each map entry must have a valid type. Each type may appear at most once.
G12  If a map exists, then each map entry must have a non-zero offset and size. The offset must point into the corresponding section of the file (for example, a string_id_item must point into the string_ids section) and the explicit or implicit size of the item must match the actual contents and size of the section.
G13  If a map exists, then the offset of map entry n+1 must be greater than or equal to the offset of map entry n plus the size of map entry n. This implies non-overlapping entries and low-to-high ordering.
G14  The following types of entries must have an offset that is four-byte-aligned: string_id_item, type_id_item, proto_id_item, field_id_item, method_id_item, class_def_item, type_list, code_item, annotations_directory_item.
G15

For each string_id_item, the string_data_off field must contain a valid reference into the data section. (v40 or earlier)

For each string_id_item, the string_data_off field must be an offset within the container and after any header that transitively uses it. (v41 or later)

For the referenced string_data_item, the data field must contain a valid MUTF-8 string, and the utf16_size must match the decoded length of the string.

G16  For each type_id_item, the descriptor_idx field must contain a valid reference into the string_ids list. The referenced string must be a valid type descriptor.
G17  For each proto_id_item, the shorty_idx field must contain a valid reference into the string_ids list. The referenced string must be a valid shorty descriptor. Also, the return_type_idx field must be a valid index into the type_ids section, and the parameters_off field must be either zero or a valid offset pointing into the data section. If non-zero, the parameter list must not contain any void entries.
G18  For each field_id_item, both the class_idx and type_idx fields must be valid indices into the type_ids list. The entry referenced by class_idx must be a non-array reference type. In addition, the name_idx field must be a valid reference into the string_ids section, and the contents of the referenced entry must conform to the MemberName specification.
G19  For each method_id_item, the class_idx field must be a valid index into the type_ids section, and the referenced entry must be a non-array reference type. The proto_idx field must be a valid reference into the proto_ids list. The name_idx field must be a valid reference into the string_ids section, and the contents of the referenced entry must conform to the MemberName specification.
G20  For each field_id_item, the class_idx field must be a valid index into the type_ids list. The referenced entry must be a non-array reference type.
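
As a rough illustration of G1 through G6, the following Python sketch checks the header fields of a file that uses the v40-or-earlier layout. The field offsets (checksum at byte 8, signature at 12, file_size at 32, header_size at 36, endian_tag at 40) and the two endian constant values come from the .dex format description. The function names check_header and seal are hypothetical, the sketch assumes a little-endian file, and it accepts any three-digit version in the magic rather than a specific list of versions.

```python
import hashlib
import struct
import zlib

ENDIAN_CONSTANT = 0x12345678
REVERSE_ENDIAN_CONSTANT = 0x78563412
HEADER_SIZE_V40 = 0x70  # header_size required by G5 for v40 or earlier


def check_header(data: bytes) -> list:
    """Return the identifiers (G1..G6) of the header constraints that fail.

    Hypothetical helper: assumes a single little-endian .dex file (no container).
    """
    if len(data) < HEADER_SIZE_V40:
        return ["truncated"]
    failed = []

    # G1: "dex\n", three ASCII digits, then a NUL byte (any version accepted here).
    magic = data[0:8]
    if not (magic[:4] == b"dex\n" and magic[4:7].isdigit() and magic[7:8] == b"\0"):
        failed.append("G1")

    # G2: Adler-32 over everything after the magic and checksum fields.
    (checksum,) = struct.unpack_from("<I", data, 8)
    if checksum != zlib.adler32(data[12:]) & 0xFFFFFFFF:
        failed.append("G2")

    # G3: SHA-1 over everything after magic, checksum, and signature.
    if data[12:32] != hashlib.sha1(data[32:]).digest():
        failed.append("G3")

    file_size, header_size, endian_tag = struct.unpack_from("<III", data, 32)
    if file_size != len(data):                      # G4 (v40 or earlier)
        failed.append("G4")
    if header_size != HEADER_SIZE_V40:              # G5 (v40 or earlier)
        failed.append("G5")
    # G6: either tag is allowed, but a byte-swapped file would need its other
    # fields swapped before the checks above make sense.
    if endian_tag not in (ENDIAN_CONSTANT, REVERSE_ENDIAN_CONSTANT):
        failed.append("G6")
    return failed


def seal(body: bytes) -> bytes:
    """Fill in the signature, then the checksum, of a synthetic test buffer."""
    buf = bytearray(body)
    buf[12:32] = hashlib.sha1(bytes(buf[32:])).digest()
    struct.pack_into("<I", buf, 8, zlib.adler32(bytes(buf[12:])) & 0xFFFFFFFF)
    return bytes(buf)
```

The signature has to be written before the checksum, because the checksum covers the signature bytes. A buffer that passes this sketch satisfies only G1 through G6; the remaining constraints in the table still need their own checks.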

Static bytecode constraints

Static constraints are constraints on individual elements of the bytecode. They usually can be checked without employing control or data-flow analysis techniques.

Identifier  Description
A1  The insns array must not be empty.
A2  The first opcode in the insns array must have index zero.
A3  The insns array must contain only valid Dalvik opcodes.
A4  The index of instruction n+1 must equal the index of instruction n plus the length of instruction n, taking into account possible operands.
A5  The last instruction in the insns array must end at index insns_size-1.
A6  All goto and if-<kind> targets must be opcodes within the same method.
A7  All targets of a packed-switch instruction must be opcodes within the same method. The size and the list of targets must be consistent.
A8  All targets of a sparse-switch instruction must be opcodes within the same method. The corresponding table must be consistent and sorted low-to-high.
A9  The B operand of the const-string and const-string/jumbo instructions must be a valid index into the string constant pool.
A10  The C operand of the iget<kind> and iput<kind> instructions must be a valid index into the field constant pool. The referenced entry must represent an instance field.
A11  The C operand of the sget<kind> and sput<kind> instructions must be a valid index into the field constant pool. The referenced entry must represent a static field.
A12  The C operand of the invoke-virtual, invoke-super, invoke-direct and invoke-static instructions must be a valid index into the method constant pool.
A13  The B operand of the invoke-virtual/range, invoke-super/range, invoke-direct/range, and invoke-static/range instructions must be a valid index into the method constant pool.
A14  A method whose name starts with a '<' must only be invoked implicitly by the VM, not by code originating from a .dex file. The only exception is the instance initializer, which may be invoked by invoke-direct.
A15  The C operand of the invoke-interface instruction must be a valid index into the method constant pool. The referenced method_id must belong to an interface (not a class).
A16  The B operand of the invoke-interface/range instruction must be a valid index into the method constant pool. The referenced method_id must belong to an interface (not a class).
A17  The B operand of the const-class, check-cast, new-instance, and filled-new-array/range instructions must be a valid index into the type constant pool.
A18  The C operand of the instance-of, new-array, and filled-new-array instructions must be a valid index into the type constant pool.
A19  The dimensions of an array created by a new-array instruction must be less than 256.
A20  The new-instance instruction must not refer to array classes, interfaces, or abstract classes.
A21  The type referred to by a new-array instruction must be a valid, non-reference type.
A22  All registers referred to by an instruction in a single-width (non-pair) fashion must be valid for the current method. That is, their indices must be non-negative and smaller than registers_size.
A23  All registers referred to by an instruction in a double-width (pair) fashion must be valid for the current method. That is, their indices must be non-negative and smaller than registers_size-1.
A24  The method_id operand of the invoke-virtual and invoke-direct instructions must belong to a class (not an interface). In .dex files prior to version 037 the same must be true of invoke-super and invoke-static instructions.
A25  The method_id operand of the invoke-virtual/range and invoke-direct/range instructions must belong to a class (not an interface). In .dex files prior to version 037 the same must be true of invoke-super/range and invoke-static/range instructions.
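
Several of the static constraints can be checked in a single linear pass, with no control-flow or data-flow analysis. The Python sketch below illustrates this for A1, A2, A4, A5, A6, A22, and A23. It works on a hypothetical, already-decoded instruction list: the Insn record and the check_static function are illustrative names, not part of any real tool, and decoding raw opcodes (which A3 and the instruction lengths depend on) is deliberately left out. A6 is interpreted here as "every branch target is the start index of some instruction in the method".

```python
from dataclasses import dataclass


@dataclass
class Insn:
    """Hypothetical pre-decoded instruction; positions are in 16-bit code units."""
    index: int              # start position within the insns array
    length: int             # length including operands
    targets: tuple = ()     # absolute branch targets (goto / if-<kind>)
    regs: tuple = ()        # registers used in single-width fashion
    pair_regs: tuple = ()   # low register of each register pair used


def check_static(insns, insns_size, registers_size):
    """Return the set of violated constraint identifiers (a subset of A1..A23)."""
    if not insns:
        return {"A1"}
    failed = set()

    if insns[0].index != 0:
        failed.add("A2")

    # A4: each instruction starts exactly where the previous one ends.
    for prev, cur in zip(insns, insns[1:]):
        if cur.index != prev.index + prev.length:
            failed.add("A4")

    # A5: the last instruction ends at index insns_size - 1.
    last = insns[-1]
    if last.index + last.length - 1 != insns_size - 1:
        failed.add("A5")

    starts = {insn.index for insn in insns}
    for insn in insns:
        # A6: a target must be the first code unit of an instruction.
        if any(t not in starts for t in insn.targets):
            failed.add("A6")
        # A22: single-width registers lie in [0, registers_size).
        if any(not 0 <= r < registers_size for r in insn.regs):
            failed.add("A22")
        # A23: the low half of a pair lies in [0, registers_size - 1).
        if any(not 0 <= r < registers_size - 1 for r in insn.pair_regs):
            failed.add("A23")
    return failed
```

The stricter bound in A23 follows from a pair occupying registers r and r+1, so the low half must leave room for the high half.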

Structural bytecode constraints

Structural constraints are constraints on relationships between several elements of the bytecode. They usually can't be checked without employing control or data-flow analysis techniques.

Identifier  Description
B1  The number and types of arguments (registers and immediate values) must always match the instruction.
B2  Register pairs must never be broken up.
B3  A register (or pair) has to be assigned first before it can be read.
B4  An invoke-direct instruction must invoke an instance initializer or a method only in the current class or one of its superclasses.
B5  An instance initializer must be invoked only on an uninitialized instance.
B6  Instance methods may be invoked, and instance fields may be accessed, only on already initialized instances.
B7  A register that holds the result of a new-instance instruction must not be used if the same new-instance instruction is again executed before the instance is initialized.
B8  An instance initializer must call another instance initializer (same class or superclass) before any instance members can be accessed. Exceptions are non-inherited instance fields, which can be assigned before calling another initializer, and the Object class in general.
B9  All actual method arguments must be assignment-compatible with their respective formal arguments.
B10  For each instance method invocation, the actual instance must be assignment-compatible with the class or interface specified in the instruction.
B11  A return<kind> instruction must match its method's return type.
B12  When accessing protected members of a superclass, the actual type of the instance being accessed must be either the current class or one of its subclasses.
B13  The type of a value stored into a static field must be assignment-compatible with or convertible to the field's type.
B14  The type of a value stored into a field must be assignment-compatible with or convertible to the field's type.
B15  The type of every value stored into an array must be assignment-compatible with the array's component type.
B16  The A operand of a throw instruction must be assignment-compatible with java.lang.Throwable.
B17  The last reachable instruction of a method must either be a backwards goto or branch, a return, or a throw instruction. It must not be possible to leave the insns array at the bottom.
B18  The unassigned half of a former register pair may not be read (is considered invalid) until it has been re-assigned by some other instruction.
B19  A move-result<kind> instruction must be immediately preceded (in the insns array) by an invoke-<kind> instruction. The only exception is the move-result-object instruction, which may also be preceded by a filled-new-array instruction.
B20  A move-result<kind> instruction must be immediately preceded (in actual control flow) by a matching return-<kind> instruction (it must not be jumped to). The only exception is the move-result-object instruction, which may also be preceded by a filled-new-array instruction.
B21  A move-exception instruction must appear only as the first instruction in an exception handler.
B22  The packed-switch-data, sparse-switch-data, and fill-array-data pseudo-instructions must not be reachable by control flow.
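
To show why structural constraints need data-flow analysis, here is a minimal Python sketch of a check for B3 (a register must be assigned before it is read). It is not how any particular verifier is implemented: it runs a standard forward "definitely assigned" analysis over a hypothetical instruction model in which each Op lists the registers it reads, the registers it writes, and the indices of its possible successors. The names Op and unassigned_reads are illustrative, and register pairs, types, and exception edges are ignored.

```python
from dataclasses import dataclass


@dataclass
class Op:
    """Hypothetical instruction model used only for this sketch."""
    reads: frozenset    # registers the instruction reads
    writes: frozenset   # registers the instruction assigns
    succs: tuple        # indices of instructions that may execute next


def unassigned_reads(ops, arg_regs):
    """Return sorted (instruction index, register) pairs that violate B3.

    arg_regs are the registers already assigned on method entry
    (the method's incoming arguments).
    """
    if not ops:
        return []
    # assigned[i]: registers assigned on every path reaching ops[i];
    # None means ops[i] has not been reached yet.
    assigned = [None] * len(ops)
    assigned[0] = frozenset(arg_regs)
    work = [0]
    while work:
        i = work.pop()
        out = assigned[i] | ops[i].writes
        for s in ops[i].succs:
            # At a merge point, keep only registers assigned on all paths.
            new = out if assigned[s] is None else assigned[s] & out
            if new != assigned[s]:
                assigned[s] = new
                work.append(s)
    return sorted(
        (i, r)
        for i, op in enumerate(ops)
        if assigned[i] is not None
        for r in op.reads - assigned[i]
    )
```

The intersection at merge points is what makes this a structural check: whether a read is valid depends on every path that can reach it, not on the instruction alone. The loop terminates because each set can only shrink after it is first computed.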