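In this article, ASM refers to OW2 ASM.
Structure of ASM-Core
First, an overview.
The Java bytecode that ASM operates on follows a strictly defined format, and that format rarely changes significantly even as the JVM specification evolves. As a result, the visitor pattern, which otherwise has a fairly narrow range of use, is applied heavily throughout the project, to an almost obsessive degree :)
Judging by the classes declared in the core package, it mainly consists of:
ClassReader - the structured object, which accepts visits from visitors
Several abstract visitor classes and their corresponding implementations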
- AnnotationVisitor -> AnnotationWriter
- ClassVisitor -> ClassWriter
- FieldVisitor -> FieldWriter
- MethodVisitor -> MethodWriter
- ModuleVisitor -> ModuleWriter
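Opcodes & Constants - the many constant symbols and values described by the ClassFile format
Other helper classes
- Attribute - handles non-standard attributes (a ClassFile may contain attributes that the JVMS does not define)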
- ByteVector - a dynamically growing byte[] (byte array)
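- Context - a class/object that holds the "accumulated state" while a ClassReader is being parsed (visited)
- Symbol - the base class representing the constants described in a ClassFile
- SymbolTable - stores the constant pool entries
- the rest is omitted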
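The ClassFile format
For the material in this section, see the ClassFile format.
The ClassFile format is the foundation and prerequisite for everything ASM does with bytecode. The JVMS defines the .class file format, and ASM's bytecode operations are built on top of that definition.
I therefore think it is well worth reading through the ClassFile format before really starting on ASM. Once you have a basic grasp of it, parsing a .class file by hand is also a good way to deepen your understanding.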
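As a starting point for such a hand-parsing exercise, here is a minimal sketch that reads only the fixed-size start of a ClassFile: magic, minor_version, major_version and constant_pool_count. The class name ClassFileHeader and its fields are made up for illustration and are not part of ASM.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper (not part of ASM): reads the first 10 bytes of a ClassFile. */
final class ClassFileHeader {
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassFileHeader(int minorVersion, int majorVersion, int constantPoolCount) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    static ClassFileHeader parse(byte[] classFile) {
        // ByteBuffer is big-endian by default, which matches the ClassFile byte order.
        ByteBuffer in = ByteBuffer.wrap(classFile);
        int magic = in.getInt();                 // u4 magic (bytes 0-3)
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException(
                "Not a class file, magic = 0x" + Integer.toHexString(magic));
        }
        int minor = in.getShort() & 0xFFFF;      // u2 minor_version (bytes 4-5)
        int major = in.getShort() & 0xFFFF;      // u2 major_version (bytes 6-7)
        int count = in.getShort() & 0xFFFF;      // u2 constant_pool_count (bytes 8-9)
        return new ClassFileHeader(minor, major, count);
    }
}
```

The input can be the bytes of any compiled class, for example the result of java.nio.file.Files.readAllBytes on a .class file. The offsets in the comments are the same ones the ClassReader constructor uses further down (6 for the major version, 8 for constant_pool_count).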
Visitor Pattern
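Because ASM is built on the rather niche visitor pattern, understanding that pattern is essential. It also gives readers of the source code a clearer way to follow it.
Section 5.11, "VISITOR: Object Behavioral", of the book Design Patterns: Elements of Reusable Object-Oriented Software explains the pattern in detail.
I am not very familiar with this topic myself, so I recommend looking up further material on your own.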
ClassReader
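ClassReader is the basic tool for parsing an existing .class file. It also acts as the holder of the parsed content and, together with SymbolTable, serves the basic access needs of visitors.
Setting aside the many private methods, the interface ClassReader exposes is quite simple. Its core consists of just the constructor ClassReader(…) and accept(…).
The ClassReader(…) constructor
As the name implies, the constructor instantiates a ClassReader object and initializes the variables it needs.
The constructor does the following:
- checks the version number
- stores the start offset of each constant pool entry in cpInfoOffsets
- stores the start offset of each bootstrap method in bootstrapMethodOffsets
- stores the size of the longest string constant in maxStringLength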
ClassReader(final byte[] classFileBuffer, final int classFileOffset, final boolean checkClassVersion) {
this.b = classFileBuffer; // the .class file buffer
// Check the major version, stored in bytes 6 and 7 (counting from byte 0)
if (checkClassVersion && readShort(classFileOffset + 6) > Opcodes.V11) {
throw new IllegalArgumentException(
"Unsupported class file major version " + readShort(classFileOffset + 6));
}
// Create the constant pool arrays; constant_pool_count is stored in bytes 8 and 9
int constantPoolCount = readUnsignedShort(classFileOffset + 8); // reads two consecutive bytes as one unsigned 16-bit value
cpInfoOffsets = new int[constantPoolCount]; // offset of each constant
cpInfoValues = new Object[constantPoolCount]; // object instance of each constant
int currentCpInfoIndex = 1;
int currentCpInfoOffset = classFileOffset + 10;
int currentMaxStringLength = 0; // size of the longest string constant
while (currentCpInfoIndex < constantPoolCount) {
cpInfoOffsets[currentCpInfoIndex++] = currentCpInfoOffset + 1;
int cpInfoSize;
switch (classFileBuffer[currentCpInfoOffset]) {
case Symbol.CONSTANT_FIELDREF_TAG:
case Symbol.CONSTANT_METHODREF_TAG:
case Symbol.CONSTANT_INTERFACE_METHODREF_TAG:
case Symbol.CONSTANT_INTEGER_TAG:
case Symbol.CONSTANT_FLOAT_TAG:
case Symbol.CONSTANT_NAME_AND_TYPE_TAG:
case Symbol.CONSTANT_INVOKE_DYNAMIC_TAG:
case Symbol.CONSTANT_DYNAMIC_TAG:
cpInfoSize = 5;
break;
case Symbol.CONSTANT_LONG_TAG:
case Symbol.CONSTANT_DOUBLE_TAG:
cpInfoSize = 9;
currentCpInfoIndex++;
break;
case Symbol.CONSTANT_UTF8_TAG:
cpInfoSize = 3 + readUnsignedShort(currentCpInfoOffset + 1);
if (cpInfoSize > currentMaxStringLength) {
// The size in bytes of this CONSTANT_Utf8 structure provides a conservative estimate
// of the length in characters of the corresponding string, and is much cheaper to
// compute than this exact length.
currentMaxStringLength = cpInfoSize;
}
break;
case Symbol.CONSTANT_METHOD_HANDLE_TAG:
cpInfoSize = 4;
break;
case Symbol.CONSTANT_CLASS_TAG:
case Symbol.CONSTANT_STRING_TAG:
case Symbol.CONSTANT_METHOD_TYPE_TAG:
case Symbol.CONSTANT_PACKAGE_TAG:
case Symbol.CONSTANT_MODULE_TAG:
cpInfoSize = 3;
break;
default:
throw new IllegalArgumentException();
}
currentCpInfoOffset += cpInfoSize;
}
this.maxStringLength = currentMaxStringLength;
// The Classfile's access_flags field is just after the last constant pool entry.
this.header = currentCpInfoOffset;
// Read the BootstrapMethods attribute, if present
int currentAttributeOffset = getFirstAttributeOffset();
int[] currentBootstrapMethodOffsets = null;
for (int i = readUnsignedShort(currentAttributeOffset - 2); i > 0; --i) {
// Read the attribute name and attribute length of each attribute_info
String attributeName = readUTF8(currentAttributeOffset, new char[maxStringLength]);
int attributeLength = readInt(currentAttributeOffset + 2);
currentAttributeOffset += 6;
// If the current attribute is BootstrapMethods, process it
if (Constants.BOOTSTRAP_METHODS.equals(attributeName)) {
// Read the num_bootstrap_methods field and create an array of this size.
currentBootstrapMethodOffsets = new int[readUnsignedShort(currentAttributeOffset)];
// Compute and store the offset of each 'bootstrap_methods' array field entry.
int currentBootstrapMethodOffset = currentAttributeOffset + 2;
for (int j = 0; j < currentBootstrapMethodOffsets.length; ++j) {
currentBootstrapMethodOffsets[j] = currentBootstrapMethodOffset;
// Skip the bootstrap_method_ref and num_bootstrap_arguments fields (2 bytes each),
// as well as the bootstrap_arguments array field (of size num_bootstrap_arguments * 2).
currentBootstrapMethodOffset +=
4 + readUnsignedShort(currentBootstrapMethodOffset + 2) * 2;
}
}
currentAttributeOffset += attributeLength;
}
this.bootstrapMethodOffsets = currentBootstrapMethodOffsets;
}
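At this point ClassReader has worked through magic, minor_version, major_version and the constant pool, recording offsets rather than decoding values. However, field_info, method_info, attribute_info and the like have still not been processed.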
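The constant pool loop in the constructor is easier to follow in isolation. The sketch below repeats the same size rules with raw tag numbers from the JVMS instead of the Symbol constants, and keeps only the offsets. ConstantPoolScanner and its field names are made up for illustration; this is not ASM code.

```java
/** Hypothetical, stripped-down version of the constant pool scan (not ASM code). */
final class ConstantPoolScanner {
    /** Offset just past the tag byte of each entry; unused slots stay 0. */
    final int[] entryOffsets;
    /** Offset of the first byte after the constant pool, i.e. access_flags. */
    final int endOffset;

    ConstantPoolScanner(byte[] b) {
        int count = readUnsignedShort(b, 8);     // constant_pool_count
        entryOffsets = new int[count];
        int index = 1;                           // index 0 is never used
        int offset = 10;                         // first cp_info starts at byte 10
        while (index < count) {
            entryOffsets[index++] = offset + 1;
            int size;
            switch (b[offset]) {
                case 1:                          // Utf8: tag + u2 length + bytes
                    size = 3 + readUnsignedShort(b, offset + 1);
                    break;
                case 3: case 4:                  // Integer, Float
                case 9: case 10: case 11:        // Fieldref, Methodref, InterfaceMethodref
                case 12: case 17: case 18:       // NameAndType, Dynamic, InvokeDynamic
                    size = 5;
                    break;
                case 5: case 6:                  // Long, Double occupy two indexes
                    size = 9;
                    index++;
                    break;
                case 15:                         // MethodHandle
                    size = 4;
                    break;
                case 7: case 8: case 16:         // Class, String, MethodType
                case 19: case 20:                // Module, Package
                    size = 3;
                    break;
                default:
                    throw new IllegalArgumentException("Unknown tag " + b[offset]);
            }
            offset += size;
        }
        endOffset = offset;
    }

    /** Reads two consecutive bytes as one big-endian unsigned value. */
    static int readUnsignedShort(byte[] b, int offset) {
        return ((b[offset] & 0xFF) << 8) | (b[offset + 1] & 0xFF);
    }
}
```

Note how a Long or Double entry advances the index twice. That is why constant_pool_count can be larger than the number of entries actually stored.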
accept(…)
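The core operation of the visitor pattern is that the structured object accepts a visitor instance and exposes its structured content fully to that visitor.
Viewed abstractly, in terms of methods, this can be understood as:
// --- accept() on the structured object ---
public void accept(Visitor visitor) {
    visitor.visit(this);
}
// --- visit() on the visitor ---
public Xxx visit(Element element) {
    // reads of element, plus any other operations
}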
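The abstract shape above can be turned into a small runnable sketch. One difference is worth pointing out: in ASM the structured object does not hand itself to the visitor. Instead, accept(…) walks the structure and reports each part through a separate visitXxx call, and the sketch below follows that style. All three types (MiniClassVisitor, MiniClassReader, RecordingVisitor) are hypothetical stand-ins, not ASM classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical visitor interface, loosely shaped like ClassVisitor. */
interface MiniClassVisitor {
    void visit(String className);
    void visitField(String fieldName);
    void visitEnd();
}

/** Hypothetical structured object, playing the role ClassReader plays in ASM. */
final class MiniClassReader {
    private final String className;
    private final List<String> fieldNames;

    MiniClassReader(String className, List<String> fieldNames) {
        this.className = className;
        this.fieldNames = fieldNames;
    }

    /** Walks the structure in a fixed order and reports each part to the visitor. */
    void accept(MiniClassVisitor visitor) {
        visitor.visit(className);
        for (String fieldName : fieldNames) {
            visitor.visitField(fieldName);
        }
        visitor.visitEnd();
    }
}

/** A visitor that simply records what it is shown. */
final class RecordingVisitor implements MiniClassVisitor {
    final List<String> events = new ArrayList<>();

    @Override public void visit(String className) { events.add("class " + className); }
    @Override public void visitField(String fieldName) { events.add("field " + fieldName); }
    @Override public void visitEnd() { events.add("end"); }
}
```

The reader never learns what the visitor does with each event. Printing, collecting and rewriting are all just different implementations of the same interface.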
public void accept(
final ClassVisitor classVisitor,
final Attribute[] attributePrototypes,
final int parsingOptions) {
// Context is a helper that holds the "accumulated state" of the visit
Context context = new Context();
context.attributePrototypes = attributePrototypes;
/**
* Parsing options (the numbers are the flag values):
* 1. SKIP_CODE - do not parse the Code attribute
* 2. SKIP_DEBUG - do not parse debug-related attributes (e.g. SourceFile, SourceDebugExtension, LocalVariableTable, LocalVariableTypeTable, LineNumberTable)
* 4. SKIP_FRAMES - skip parsing of the StackMap and StackMapTable attributes
* ...
*/
context.parsingOptions = parsingOptions;
// Character buffer used when reading constants from the constant pool
context.charBuffer = new char[maxStringLength];
// Read the access_flags, this_class, super_class, interface_count and interfaces fields.
char[] charBuffer = context.charBuffer;
int currentOffset = header;
int accessFlags = readUnsignedShort(currentOffset);
String thisClass = readClass(currentOffset + 2, charBuffer);
String superClass = readClass(currentOffset + 4, charBuffer);
String[] interfaces = new String[readUnsignedShort(currentOffset + 6)];
currentOffset += 8;
for (int i = 0; i < interfaces.length; ++i) {
interfaces[i] = readClass(currentOffset, charBuffer);
currentOffset += 2;
}
// Read the class attributes (the variables are ordered as in Section 4.7 of the JVMS).
// Attribute offsets exclude the attribute_name_index and attribute_length fields.
// - The offset of the InnerClasses attribute, or 0.
int innerClassesOffset = 0;
// - The offset of the EnclosingMethod attribute, or 0.
int enclosingMethodOffset = 0;
// - The string corresponding to the Signature attribute, or null.
String signature = null;
// - The string corresponding to the SourceFile attribute, or null.
String sourceFile = null;
// - The string corresponding to the SourceDebugExtension attribute, or null.
String sourceDebugExtension = null;
// - The offset of the RuntimeVisibleAnnotations attribute, or 0.
int runtimeVisibleAnnotationsOffset = 0;
// - The offset of the RuntimeInvisibleAnnotations attribute, or 0.
int runtimeInvisibleAnnotationsOffset = 0;
// - The offset of the RuntimeVisibleTypeAnnotations attribute, or 0.
int runtimeVisibleTypeAnnotationsOffset = 0;
// - The offset of the RuntimeInvisibleTypeAnnotations attribute, or 0.
int runtimeInvisibleTypeAnnotationsOffset = 0;
// - The offset of the Module attribute, or 0.
int moduleOffset = 0;
// - The offset of the ModulePackages attribute, or 0.
int modulePackagesOffset = 0;
// - The string corresponding to the ModuleMainClass attribute, or null.
String moduleMainClass = null;
// - The string corresponding to the NestHost attribute, or null.
String nestHostClass = null;
// - The offset of the NestMembers attribute, or 0.
int nestMembersOffset = 0;
// - The non standard attributes (linked with their {@link Attribute#nextAttribute} field).
// This list is in the <i>reverse order</i> of their order in the ClassFile structure.
Attribute attributes = null;
// Parse the attributes held by the class
int currentAttributeOffset = getFirstAttributeOffset();
for (int i = readUnsignedShort(currentAttributeOffset - 2); i > 0; --i) {
// Read the attribute_info's attribute_name and attribute_length fields.
String attributeName = readUTF8(currentAttributeOffset, charBuffer);
int attributeLength = readInt(currentAttributeOffset + 2);
currentAttributeOffset += 6;
// The tests are sorted in decreasing frequency order (based on frequencies observed on
// typical classes).
if (Constants.SOURCE_FILE.equals(attributeName)) {
sourceFile = readUTF8(currentAttributeOffset, charBuffer);
} else if (Constants.INNER_CLASSES.equals(attributeName)) {
innerClassesOffset = currentAttributeOffset;
} else if (Constants.ENCLOSING_METHOD.equals(attributeName)) {
enclosingMethodOffset = currentAttributeOffset;
} else if (Constants.NEST_HOST.equals(attributeName)) {
nestHostClass = readClass(currentAttributeOffset, charBuffer);
} else if (Constants.NEST_MEMBERS.equals(attributeName)) {
nestMembersOffset = currentAttributeOffset;
} else if (Constants.SIGNATURE.equals(attributeName)) {
signature = readUTF8(currentAttributeOffset, charBuffer);
} else if (Constants.RUNTIME_VISIBLE_ANNOTATIONS.equals(attributeName)) {
runtimeVisibleAnnotationsOffset = currentAttributeOffset;
} else if (Constants.RUNTIME_VISIBLE_TYPE_ANNOTATIONS.equals(attributeName)) {
runtimeVisibleTypeAnnotationsOffset = currentAttributeOffset;
} else if (Constants.DEPRECATED.equals(attributeName)) {
accessFlags |= Opcodes.ACC_DEPRECATED;
} else if (Constants.SYNTHETIC.equals(attributeName)) {
accessFlags |= Opcodes.ACC_SYNTHETIC;
} else if (Constants.SOURCE_DEBUG_EXTENSION.equals(attributeName)) {
sourceDebugExtension =
readUTF(currentAttributeOffset, attributeLength, new char[attributeLength]);
} else if (Constants.RUNTIME_INVISIBLE_ANNOTATIONS.equals(attributeName)) {
runtimeInvisibleAnnotationsOffset = currentAttributeOffset;
} else if (Constants.RUNTIME_INVISIBLE_TYPE_ANNOTATIONS.equals(attributeName)) {
runtimeInvisibleTypeAnnotationsOffset = currentAttributeOffset;
} else if (Constants.MODULE.equals(attributeName)) {
moduleOffset = currentAttributeOffset;
} else if (Constants.MODULE_MAIN_CLASS.equals(attributeName)) {
moduleMainClass = readClass(currentAttributeOffset, charBuffer);
} else if (Constants.MODULE_PACKAGES.equals(attributeName)) {
modulePackagesOffset = currentAttributeOffset;
} else if (Constants.BOOTSTRAP_METHODS.equals(attributeName)) {
// This attribute is read in the constructor.
} else {
Attribute attribute =
readAttribute(
attributePrototypes,
attributeName,
currentAttributeOffset,
attributeLength,
charBuffer,
-1,
null);
attribute.nextAttribute = attributes;
attributes = attribute;
}
currentAttributeOffset += attributeLength;
}
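// The first visit() call. It lets the ClassVisitor implementation handle the class's version, access flags, name, signature, superclass and interfaces.
// What visit() actually does is entirely up to the implementation. For a visitor whose job is printing, simply printing these values is a perfectly valid way to visit.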
// Visit the class declaration. The minor_version and major_version fields start 6 bytes before
// the first constant pool entry, which itself starts at cpInfoOffsets[1] - 1 (by definition).
classVisitor.visit(
readInt(cpInfoOffsets[1] - 7), accessFlags, thisClass, signature, superClass, interfaces);
// Visit the SourceFile and SourceDebugExtension attributes.
if ((parsingOptions & SKIP_DEBUG) == 0
&& (sourceFile != null || sourceDebugExtension != null)) {
classVisitor.visitSource(sourceFile, sourceDebugExtension);
}
// Visit the Module, ModulePackages and ModuleMainClass attributes.
if (moduleOffset != 0) {
readModule(classVisitor, context, moduleOffset, modulePackagesOffset, moduleMainClass);
}
// Visit the NestHost attribute.
if (nestHostClass != null) {
classVisitor.visitNestHostExperimental(nestHostClass);
}
// Visit the EnclosingMethod attribute.
if (enclosingMethodOffset != 0) {
String className = readClass(enclosingMethodOffset, charBuffer);
int methodIndex = readUnsignedShort(enclosingMethodOffset + 2);
String name = methodIndex == 0 ? null : readUTF8(cpInfoOffsets[methodIndex], charBuffer);
String type = methodIndex == 0 ? null : readUTF8(cpInfoOffsets[methodIndex] + 2, charBuffer);
classVisitor.visitOuterClass(className, name, type);
}
// Visit the RuntimeVisibleAnnotations attribute.
if (runtimeVisibleAnnotationsOffset != 0) {
int numAnnotations = readUnsignedShort(runtimeVisibleAnnotationsOffset);
int currentAnnotationOffset = runtimeVisibleAnnotationsOffset + 2;
while (numAnnotations-- > 0) {
// Parse the type_index field.
String annotationDescriptor = readUTF8(currentAnnotationOffset, charBuffer);
currentAnnotationOffset += 2;
// Parse num_element_value_pairs and element_value_pairs and visit these values.
currentAnnotationOffset =
readElementValues(
classVisitor.visitAnnotation(annotationDescriptor, /* visible = */ true),
currentAnnotationOffset,
/* named = */ true,
charBuffer);
}
}
// Visit the RuntimeInvisibleAnnotations attribute.
if (runtimeInvisibleAnnotationsOffset != 0) {
int numAnnotations = readUnsignedShort(runtimeInvisibleAnnotationsOffset);
int currentAnnotationOffset = runtimeInvisibleAnnotationsOffset + 2;
while (numAnnotations-- > 0) {
// Parse the type_index field.
String annotationDescriptor = readUTF8(currentAnnotationOffset, charBuffer);
currentAnnotationOffset += 2;
// Parse num_element_value_pairs and element_value_pairs and visit these values.
currentAnnotationOffset =
readElementValues(
classVisitor.visitAnnotation(annotationDescriptor, /* visible = */ false),
currentAnnotationOffset,
/* named = */ true,
charBuffer);
}
}
// Visit the RuntimeVisibleTypeAnnotations attribute.
if (runtimeVisibleTypeAnnotationsOffset != 0) {
int numAnnotations = readUnsignedShort(runtimeVisibleTypeAnnotationsOffset);
int currentAnnotationOffset = runtimeVisibleTypeAnnotationsOffset + 2;
while (numAnnotations-- > 0) {
// Parse the target_type, target_info and target_path fields.
currentAnnotationOffset = readTypeAnnotationTarget(context, currentAnnotationOffset);
// Parse the type_index field.
String annotationDescriptor = readUTF8(currentAnnotationOffset, charBuffer);
currentAnnotationOffset += 2;
// Parse num_element_value_pairs and element_value_pairs and visit these values.
currentAnnotationOffset =
readElementValues(
classVisitor.visitTypeAnnotation(
context.currentTypeAnnotationTarget,
context.currentTypeAnnotationTargetPath,
annotationDescriptor,
/* visible = */ true),
currentAnnotationOffset,
/* named = */ true,
charBuffer);
}
}
// Visit the RuntimeInvisibleTypeAnnotations attribute.
if (runtimeInvisibleTypeAnnotationsOffset != 0) {
int numAnnotations = readUnsignedShort(runtimeInvisibleTypeAnnotationsOffset);
int currentAnnotationOffset = runtimeInvisibleTypeAnnotationsOffset + 2;
while (numAnnotations-- > 0) {
// Parse the target_type, target_info and target_path fields.
currentAnnotationOffset = readTypeAnnotationTarget(context, currentAnnotationOffset);
// Parse the type_index field.
String annotationDescriptor = readUTF8(currentAnnotationOffset, charBuffer);
currentAnnotationOffset += 2;
// Parse num_element_value_pairs and element_value_pairs and visit these values.
currentAnnotationOffset =
readElementValues(
classVisitor.visitTypeAnnotation(
context.currentTypeAnnotationTarget,
context.currentTypeAnnotationTargetPath,
annotationDescriptor,
/* visible = */ false),
currentAnnotationOffset,
/* named = */ true,
charBuffer);
}
}
// Visit the non standard attributes.
while (attributes != null) {
// Copy and reset the nextAttribute field so that it can also be used in ClassWriter.
Attribute nextAttribute = attributes.nextAttribute;
attributes.nextAttribute = null;
classVisitor.visitAttribute(attributes);
attributes = nextAttribute;
}
// Visit the NestMembers attribute.
if (nestMembersOffset != 0) {
int numberOfNestMembers = readUnsignedShort(nestMembersOffset);
int currentNestMemberOffset = nestMembersOffset + 2;
while (numberOfNestMembers-- > 0) {
classVisitor.visitNestMemberExperimental(readClass(currentNestMemberOffset, charBuffer));
currentNestMemberOffset += 2;
}
}
// Visit the InnerClasses attribute.
if (innerClassesOffset != 0) {
int numberOfClasses = readUnsignedShort(innerClassesOffset);
int currentClassesOffset = innerClassesOffset + 2;
while (numberOfClasses-- > 0) {
classVisitor.visitInnerClass(
readClass(currentClassesOffset, charBuffer),
readClass(currentClassesOffset + 2, charBuffer),
readUTF8(currentClassesOffset + 4, charBuffer),
readUnsignedShort(currentClassesOffset + 6));
currentClassesOffset += 8;
}
}
// Visit the fields and methods.
int fieldsCount = readUnsignedShort(currentOffset);
currentOffset += 2;
while (fieldsCount-- > 0) {
currentOffset = readField(classVisitor, context, currentOffset);
}
int methodsCount = readUnsignedShort(currentOffset);
currentOffset += 2;
while (methodsCount-- > 0) {
currentOffset = readMethod(classVisitor, context, currentOffset);
}
// Visit the end of the class.
classVisitor.visitEnd();
}
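The accept(…) listing above tests parsingOptions with a bit mask, for example (parsingOptions & SKIP_DEBUG) == 0. The sketch below isolates that check. The three constant values mirror the ClassReader constants of the same names, while ParsingOptionsDemo and visitsDebugInfo are names made up for illustration.

```java
/** Hypothetical demo class (not ASM code) showing how the option flags combine. */
final class ParsingOptionsDemo {
    // Each option is a distinct bit, so several options can be OR-ed together.
    static final int SKIP_CODE = 1;
    static final int SKIP_DEBUG = 2;
    static final int SKIP_FRAMES = 4;

    /** Same test as in accept(...): debug info is visited only if SKIP_DEBUG is not set. */
    static boolean visitsDebugInfo(int parsingOptions) {
        return (parsingOptions & SKIP_DEBUG) == 0;
    }
}
```

Passing 0 therefore means "parse everything", and SKIP_DEBUG | SKIP_FRAMES skips both debug attributes and stack map frames in a single int argument.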
Summary
In fact, it is reasonable to think of ClassReader as a whole as a parser for .class files.
- The constructor first checks the version of the .class file (the code above compares only major_version).
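- It then scans the whole constant_pool.
- It also locates the BootstrapMethods attribute.
- Afterwards, accept(…) calls the corresponding visitor methods one by one so that each part of the content is visited.
Note, however, that ClassReader never writes to the content of the .class file it parses. All writes, whatever their purpose, are implemented in ClassVisitor subclasses.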
ClassVisitor
A visitor for a Java .class file. Its methods must be called one by one in the following strict order:
visit [ visitSource ] [ visitModule ][ visitNestHost ][ visitOuterClass ] ( visitAnnotation | visitTypeAnnotation | visitAttribute )* ( visitNestMember | visitInnerClass | visitField | visitMethod )* visitEnd.
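Because the ordering rule above is written like a grammar, a sequence of call names can be checked against it with a regular expression. The following is only a toy illustration of the rule: VisitOrderChecker is a made-up name, and the check looks at method names only, not arguments. For real validation, the asm-util module ships CheckClassAdapter.

```java
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical checker (not ASM code) for the documented ClassVisitor call order. */
final class VisitOrderChecker {
    // The call names are joined with single spaces, then matched as one string.
    private static final Pattern ORDER = Pattern.compile(
        "visit( visitSource)?( visitModule)?( visitNestHost)?( visitOuterClass)?"
            + "( (visitAnnotation|visitTypeAnnotation|visitAttribute))*"
            + "( (visitNestMember|visitInnerClass|visitField|visitMethod))*"
            + " visitEnd");

    /** Returns true if the sequence of method names follows the required order. */
    static boolean isValid(List<String> calls) {
        return ORDER.matcher(String.join(" ", calls)).matches();
    }
}
```

For instance, visit, visitSource, visitField, visitEnd is accepted, while a sequence that calls visitSource after visitField, or that never reaches visitEnd, is rejected.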
The visitXxx methods
// Method bodies are omitted in this listing.
public abstract class ClassVisitor {
    /**
     * Visits the header of the class.
     */
    public void visit(final int version, final int access, final String name, final String signature, final String superName, final String[] interfaces) {}
    /**
     * Visits the source file name (and debug information) of the class.
     */
    public void visitSource(final String source, final String debug) {}
    /**
     * Visits the module associated with the class.
     */
    public ModuleVisitor visitModule(final String name, final int access, final String version) {}
    public void visitOuterClass(final String owner, final String name, final String descriptor) {}
    public AnnotationVisitor visitAnnotation(final String descriptor, final boolean visible) {}
    public AnnotationVisitor visitTypeAnnotation(
            final int typeRef, final TypePath typePath, final String descriptor, final boolean visible) {}
    public void visitAttribute(final Attribute attribute) {}
    public void visitInnerClass(
            final String name, final String outerName, final String innerName, final int access) {}
    /**
     * Visits a field of the class.
     */
    public FieldVisitor visitField(
            final int access,
            final String name,
            final String descriptor,
            final String signature,
            final Object value) {}
    /**
     * Visits a method of the class.
     */
    public MethodVisitor visitMethod(
            final int access,
            final String name,
            final String descriptor,
            final String signature,
            final String[] exceptions) {}
    public void visitEnd() {}
}
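As the visitXxx() methods are executed one by one, the ClassVisitor (for example a ClassWriter) learns more and more about the current .class file and gradually fills in the constant pool (held and maintained by SymbolTable).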
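To make "gradually fills in the constant pool" concrete, here is a minimal sketch of a table that hands out constant pool indexes as new strings arrive and reuses the index of a string it has already seen. MiniSymbolTable is a hypothetical simplification that handles only Utf8-like string entries. ASM's real SymbolTable covers every constant type and is implemented differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, heavily simplified stand-in for a constant pool builder (not ASM code). */
final class MiniSymbolTable {
    private final Map<String, Integer> utf8Indexes = new HashMap<>();
    private final List<String> entries = new ArrayList<>();

    MiniSymbolTable() {
        entries.add(null); // constant pool index 0 is reserved, real entries start at 1
    }

    /** Returns the index of the given string, adding a new entry only if it is not yet present. */
    int addUtf8(String value) {
        Integer existing = utf8Indexes.get(value);
        if (existing != null) {
            return existing;
        }
        int index = entries.size();
        entries.add(value);
        utf8Indexes.put(value, index);
        return index;
    }

    /** The value that would be written as constant_pool_count (number of entries + 1). */
    int constantPoolCount() {
        return entries.size();
    }
}
```

Each visitXxx call that mentions a name or descriptor would go through something like addUtf8, so the pool grows only when the visitor sees a string it has not met before.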
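Conclusion
At this point you should have a basic, simple picture of ClassReader and ClassVisitor as a whole.
ClassReader builds a concrete picture of a class by parsing the bytes of its .class file (leaning towards free access to any detail of the .class).
ClassVisitor, through its visitXxx(…) methods, is gradually shown details of the .class by some other object (a ClassReader, or the programmer's own code directly), but it has to maintain whatever it receives itself (if it needs to). This is how it comes to know the full content of the .class (and, of course, if the visitXxx() calls deliver incomplete content, what it knows is correspondingly limited).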
__ __
/ _| __ _ _ __ __ _ / _| ___ _ __ __ _
| |_ / _` | '_ \ / _` | |_ / _ \ '_ \ / _` |
| _| (_| | | | | (_| | _| __/ | | | (_| |
|_| \__,_|_| |_|\__, |_| \___|_| |_|\__, |
|___/ |___/