mirror of
https://github.com/zaaarf/lillero-book.git
synced 2024-12-22 10:34:52 +01:00
chore: reorganize + index skeleton
parent
28760cdd97
commit
a7d6641bfd
9 changed files with 36 additions and 19 deletions
|
@ -8,4 +8,4 @@ In short, this is no replacement for the ASM manual: think of the Lillero Book a
|
||||||
## Building
|
## Building
|
||||||
This is built with [mdbook](https://github.com/rust-lang/mdBook): simply install `mdbook`, clone this, and run `mdbook build` in the root folder. You'll find the compiled and static html in the `book` subfolder.
|
This is built with [mdbook](https://github.com/rust-lang/mdBook): simply install `mdbook`, clone this, and run `mdbook build` in the root folder. You'll find the compiled and static html in the `book` subfolder.
|
||||||
|
|
||||||
You can also find a live version [here](https://lll.fantabos.co/book/), if you prefer.
|
You can also find a live version [here](https://lll.zaaarf.foo/book/), if you prefer.
|
||||||
|
|
|
@ -7,4 +7,4 @@ In short, ASM patching should always be the very last resort. That is not to say
|
||||||
|
|
||||||
Though reviled by many, ASM patching remains one of the most powerful tools in the Java modder's arsenal. Like every tool, ASM patching is not evil in itself. When used correctly, it can solve just about any problem elegantly with a minuscule footprint. When done incorrectly, it can wreak havoc on the entire environment, causing inexplicable crashes and pulling the rug from underneath everyone else wishing to modify the program just like you.
|
Though reviled by many, ASM patching remains one of the most powerful tools in the Java modder's arsenal. Like every tool, ASM patching is not evil in itself. When used correctly, it can solve just about any problem elegantly with a minuscule footprint. When done incorrectly, it can wreak havoc on the entire environment, causing inexplicable crashes and pulling the rug from underneath everyone else wishing to modify the program just like you.
|
||||||
|
|
||||||
This latter issue has led most of the new generation of modders to reject ASM patching altogether, in favour of higher-level solutions, ditching the complexities of bytecode in favour of the checked safety of plain Java. In Minecraft's case, one such solution is [Mixin](https://github.com/SpongePowered/Mixin/).
|
This latter issue has led most of the new generation of modders to reject ASM patching altogether, in favour of higher-level solutions, ditching the complexities of bytecode in favour of the relative safety of plain Java. In Minecraft's case, one such solution is [Mixin](https://github.com/SpongePowered/Mixin/).
|
||||||
|
|
|
@ -1,17 +1,8 @@
|
||||||
# Bytecode
|
# Bytecode
|
||||||
Before we get into the specifics of bytecode manipulation, you should understand what exactly you will be dealing with. Patching essentially consists in modifying the *bytecode* of a class. If you're familiar with any flavour of assembly language, this will all look very familiar.
|
Before we get into the specifics of bytecode manipulation, you should understand what exactly you will be dealing with. Patching essentially consists in modifying the *bytecode* of a class. If you're familiar with any flavour of assembly language, this will all look very familiar.
|
||||||
|
|
||||||
Essentially, any programming language *targeting* the JVM (short for Java Virtual Machine) will be convereted by its compiler into machine code. Except that the machine code isn't going to be the one of *your* computer, as it happens with other programming languages: it will be the machine code of the JVM since it will be the one running your program anyway.
|
Essentially, any programming language *targeting* the JVM (short for Java Virtual Machine) will be converted by its compiler into machine code. Except that the machine code isn't going to be the one of *your* computer, as it happens with other programming languages: it will be the machine code of the JVM since it will be the one running your program anyway.
|
||||||
|
|
||||||
*Java bytecode* is a human-readable representation of the machine code that the JVM is meant to interpret. With the right tools, it can be manipulated to change the behaviour of a program - which brings us here. Java bytecode is relatively high-level when compared to its native counterpart, including support for more abstract concepts like classes and inheritance, but still requires a way of thinking much closer to the functioning of a machine than what is needed for regular programming.
|
*Java bytecode* is a human-readable representation of the machine code that the JVM is meant to interpret. With the right tools, it can be manipulated to change the behaviour of a program - which brings us here. Java bytecode is relatively high-level when compared to its native counterpart, including support for more abstract concepts like classes and inheritance, but still requires a way of thinking much closer to the functioning of a machine than what is needed for regular programming.
|
||||||
|
|
||||||
Bytecode instructions are made up of various parts; first comes the *opcode*, a numerical ID (though you work with human-readable aliases for these numbers) then come a number of arguments which may vary depending on the opcode.
|
Bytecode instructions are made up of various parts; first comes the *opcode*, a numerical ID (though you work with human-readable aliases for these numbers) then come a number of arguments which may vary depending on the opcode.
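To make the opcode-plus-arguments layout concrete, here is a minimal sketch of a decoder for a handful of instructions. It is not part of ASM or Lillero; the class name `TinyDisassembler` is made up for illustration. The opcode values themselves (`0x10` for `bipush`, `0x1A` and `0x1B` for `iload_0` and `iload_1`, `0x60` for `iadd`, `0xAC` for `ireturn`) are taken from the JVM specification.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only: it covers five opcodes, not the full set.
final class TinyDisassembler {
    static List<String> disassemble(byte[] code) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < code.length) {
            int opcode = code[i++] & 0xFF; // opcodes are unsigned bytes
            switch (opcode) {
                case 0x10: // bipush takes one argument: a signed byte
                    out.add("bipush " + code[i++]);
                    break;
                case 0x1A: out.add("iload_0"); break; // no arguments
                case 0x1B: out.add("iload_1"); break; // no arguments
                case 0x60: out.add("iadd"); break;    // no arguments
                case 0xAC: out.add("ireturn"); break; // no arguments
                default:
                    throw new IllegalArgumentException(
                        "opcode not covered by this sketch: 0x" + Integer.toHexString(opcode));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] code = {0x1A, 0x10, 42, 0x60, (byte) 0xAC};
        disassemble(code).forEach(System.out::println);
    }
}
```

Note how `bipush` consumes an extra byte while the others do not: the number of arguments is decided entirely by the opcode.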
|
||||||
|
|
||||||
## Stack-oriented programming
|
|
||||||
If you've ever attended any formal programming course, you'll be certainly familiar with the concepts of *stack* and *heap*. While on Java they'll at most be an occasional passing thought, when dealing with bytecode they become central.
|
|
||||||
|
|
||||||
The stack is a quickly-accessible memory region that follows the rule *first in, last out*. It's often compared to a stack of plates: you can only ever add (*push*) new plates on the top, and can only ever take (*pop*) the one on the very top. It's highly efficient, but anything that gets put on the stack must *have* a known memory size at compile time. This makes it suitable for working with primitives, but not quite as much for objects. Those follow different rules.
|
|
||||||
|
|
||||||
Objects are stored on the heap, and only a reference to their memory region - a map of sorts to find where their data is located - is pushed onto the stack. The heap is a messier, but bigger place: it's slower, but it allows retrieval of values from any point and doesn't need to know in advance the size of everything.
|
|
||||||
|
|
||||||
Most bytecode instructions affect the stack in some way, either by taking its arguments from it or by pushing the result of the operation onto it.
|
|
2
src/2_patching/bytecode_examples.md
Normal file
|
@ -0,0 +1,2 @@
|
||||||
|
# Bytecode examples
|
||||||
|
TODO
|
|
@ -1,8 +1,8 @@
|
||||||
# Nodes
|
# Nodes
|
||||||
The [ASM](https://asm.ow2.io/) library represents sequences of bytecode as [doubly linked lists](https://en.wikipedia.org/wiki/Doubly_linked_list), with the [`InsnList`](https://asm.ow2.io/javadoc/org/objectweb/asm/tree/InsnList.html) type. Lillero provides an extended functionality
|
The [ASM](https://asm.ow2.io/) library represents sequences of bytecode as [doubly linked lists](https://en.wikipedia.org/wiki/Doubly_linked_list), with the [`InsnList`](https://asm.ow2.io/javadoc/org/objectweb/asm/tree/InsnList.html) type. Lillero provides and supports an extended version with some additional functionality as [`InsnSequence`](https://lll.zaaarf.foo/javadoc/lillero/ftbsc/lll/tools/InsnSequence.html).
|
||||||
|
|
||||||
Each instruction is a node, represented by [various subclasses](https://asm.ow2.io/javadoc/org/objectweb/asm/tree/package-summary.html) of [`AbstractInsnNode`](https://asm.ow2.io/javadoc/org/objectweb/asm/tree/AbstractInsnNode.html); each node contains an opcode, a number of parameters depending on the opcode type, and references to the preceding and following nodes.
|
Each instruction is a node, represented by [various subclasses](https://asm.ow2.io/javadoc/org/objectweb/asm/tree/package-summary.html) of [`AbstractInsnNode`](https://asm.ow2.io/javadoc/org/objectweb/asm/tree/AbstractInsnNode.html); each node contains an opcode, a number of parameters depending on the opcode type, and references to the preceding and following nodes.
|
||||||
|
|
||||||
The `InsnList` representing the method's nodes is `MethodNode`'s `instructions` field. You can perform all operations you'd expect: append, insert, remove, etcetera. You should aim to leave the smallest possible footprint on the method, so *removing* nodes is almost always a bad idea. You can achieve the same result by *jumping over* the part you wish to remove.
|
The `InsnList` representing the method's nodes is `MethodNode`'s `instructions` field. You can perform all operations you'd expect: append, insert, remove, etcetera. You should aim to leave the smallest possible footprint on the method, so *removing* nodes is almost always a bad idea. You can achieve the same result by *jumping over* the part you wish to remove, without breaking lookup done by other patchers.
|
||||||
|
|
||||||
We'll now check out the various types of instruction nodes; you can find a detailed list of opcodes, with explanations, both on [this Wikipedia page](https://en.wikipedia.org/wiki/List_of_Java_bytecode_instructions) and on the [Java SE Specifications](https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-6.html).
|
We'll now broadly check out the various types of instruction nodes; you can find a detailed list of opcodes, with explanations, both on [this Wikipedia page](https://en.wikipedia.org/wiki/List_of_Java_bytecode_instructions) and on the [Java SE Specifications](https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-6.html), so going over them one by one seems futile. Just keep the reference at hand when patching, and you'll be fine: it's not like anybody actually knows all of their functionalities by heart. At least, I hope not.
|
||||||
|
|
|
@ -1,6 +1,8 @@
|
||||||
# Patching
|
# Patching
|
||||||
Since you are applying changes to the bytecode of a class, this must necessarily happen before said class is loaded in memory. The component that applies said changes is called a *loader*; don't concern yourself with the inner workings of loaders for now, just know that they are in charge of the initial step: we'll cover them in detail in their own chapter.
|
Since you are applying changes to the bytecode of a class, this must necessarily happen before said class is loaded in memory. The component that applies said changes is called a *loader*; don't concern yourself with the inner workings of loaders for now, just know that they are in charge of that initial step: we'll cover them in detail in their own chapter.
|
||||||
|
|
||||||
|
Generally speaking, you can solve *any* problem that can be solved via patching by modifying one or more methods in the correct way. This is preferable, as you're unlikely to inadvertently break compatibility with other parts of the program relying on that method if you stick to making small changes to the function body.
|
||||||
|
|
||||||
Suppose that you already have a working loader in place. This loader calls your *injector method*, and passes it a [`ClassNode`](https://asm.ow2.io/javadoc/org/objectweb/asm/tree/ClassNode.html) and a [`MethodNode`](https://asm.ow2.io/javadoc/org/objectweb/asm/tree/MethodNode.html) as arguments, representing respectively the container class and the method you're targeting. This is the most common type of ASM patching, and it's probably why you're here; more advanced subjects may be covered in additional chapters later on.
|
Suppose that you already have a working loader in place. This loader calls your *injector method*, and passes it a [`ClassNode`](https://asm.ow2.io/javadoc/org/objectweb/asm/tree/ClassNode.html) and a [`MethodNode`](https://asm.ow2.io/javadoc/org/objectweb/asm/tree/MethodNode.html) as arguments, representing respectively the container class and the method you're targeting. This is the most common type of ASM patching, and it's probably why you're here; more advanced subjects may be covered in additional chapters later on.
|
||||||
|
|
||||||
At a glance, this might seem restrictive. However, do keep in mind that even code outside of methods - in field declarations, in loose blocks, or in `static` blocks - is actually considered to be part of a method by the compiler. Specifically, the constructor (`<init>`) for instance fields and loose blocks, and the static constructor (`<clinit>`) for static fields and `static` blocks.
|
At a glance, if you're targeting something written in Java, this might seem restrictive. However, do keep in mind that even code outside of methods - in field declarations, in loose blocks, or in `static` blocks - is actually considered to be part of a method by the compiler. Specifically, the constructor (`<init>`) for instance fields and loose blocks, and the static constructor (`<clinit>`) for static fields and `static` blocks.
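The claim about `<init>` and `<clinit>` can be observed from plain Java, without touching bytecode at all. The following is a small sketch (the class name `InitOrderDemo` and the `log` helper are invented for this example) that records the order in which each piece of code runs:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: each initializer logs its own name when it runs.
final class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    static int counter = log("static field");    // compiled into <clinit>
    static { log("static block"); }               // compiled into <clinit>

    int value = log("instance field");            // compiled into <init>
    { log("loose block"); }                       // compiled into <init>

    InitOrderDemo() { log("constructor body"); }  // the remainder of <init>

    private static int log(String what) {
        LOG.add(what);
        return 0;
    }

    public static void main(String[] args) {
        new InitOrderDemo();
        LOG.forEach(System.out::println);
    }
}
```

Creating one instance logs the two static entries first, then the instance field, the loose block, and finally the constructor body. Running `javap -c` on the compiled class should show the corresponding instructions sitting inside `<clinit>` (displayed as `static {}`) and the constructor.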
|
||||||
|
|
8
src/2_patching/stack.md
Normal file
|
@ -0,0 +1,8 @@
|
||||||
|
# Stack-oriented programming
|
||||||
|
If you've ever attended any formal programming course, you'll certainly be familiar with the concepts of *stack* and *heap*. While working on regular Java they'll at most be an occasional passing thought, but when dealing with bytecode they become central. In fact, unlike most native assembly languages, which revolve around registers, Java bytecode is what you'd call a [*stack-oriented* programming language](https://en.wikipedia.org/wiki/Stack-oriented_programming).
|
||||||
|
|
||||||
|
The stack is a quickly-accessible memory region that follows the rule *first in, last out*. It's often compared to a stack of plates: you can only ever add (*push*) new plates on the top, and can only ever take (*pop*) the one on the very top. It's highly efficient, but anything that gets put on the stack must *have* a known memory size at compile time. This makes it suitable for working with primitives, but not quite as much for objects. Those follow different rules.
|
||||||
|
|
||||||
|
When you create a new object, memory is allocated on the heap, and a *reference* to the object is pushed onto the stack. A reference is a value of known and fixed size that tells the JVM where a certain object is located in memory. The heap is a messier, but bigger place: it's slower, but it allows retrieval of values from any point and doesn't need to know the size of everything in advance.
|
||||||
|
|
||||||
|
Most bytecode instructions affect the stack in some way. Depending on the opcode, values may be *popped* from the stack and/or a return value may be *pushed* onto it. Understanding how the stack works and how to work with it are necessary steps to gaining a true understanding of bytecode.
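As a rough illustration of the pop-and-push rhythm, here is a toy operand stack in plain Java. It is only a sketch: `OperandStackSketch` is a made-up name, numeric strings stand in for constant-pushing instructions such as `iconst_2` or `bipush`, and the real JVM keeps one such stack per method invocation, alongside local variables that this example ignores.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy interpreter: only integer constants and three arithmetic opcodes.
final class OperandStackSketch {
    static int run(String... program) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String insn : program) {
            switch (insn) {
                case "iadd": {
                    int right = stack.pop(); // the value pushed last comes off first
                    int left = stack.pop();
                    stack.push(left + right);
                    break;
                }
                case "isub": {
                    int right = stack.pop();
                    int left = stack.pop();
                    stack.push(left - right); // operand order matters here
                    break;
                }
                case "imul": {
                    int right = stack.pop();
                    int left = stack.pop();
                    stack.push(left * right);
                    break;
                }
                default:
                    // Anything else is treated as an integer constant to push.
                    stack.push(Integer.parseInt(insn));
            }
        }
        return stack.pop(); // plays the role of ireturn
    }

    public static void main(String[] args) {
        System.out.println(run("2", "3", "iadd", "4", "imul")); // (2 + 3) * 4
        System.out.println(run("10", "3", "isub"));             // 10 - 3
    }
}
```

The `isub` case is the one worth remembering: since the second operand is pushed last, it is popped first, so getting the order wrong silently computes `3 - 10` instead of `10 - 3`.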
|
|
@ -5,4 +5,18 @@
|
||||||
- [Why Lillero?](./1_introduction/why_lillero.md)
|
- [Why Lillero?](./1_introduction/why_lillero.md)
|
||||||
- [Patching Methods](./2_patching/patching.md)
|
- [Patching Methods](./2_patching/patching.md)
|
||||||
- [Bytecode](./2_patching/bytecode.md)
|
- [Bytecode](./2_patching/bytecode.md)
|
||||||
- [Nodes](./2_patching/nodes.md)
|
- [Stack-oriented Programming](./2_patching/stack.md)
|
||||||
|
- [Examples](./2_patching/bytecode_examples.md)
|
||||||
|
- [Nodes](./2_patching/nodes.md)
|
||||||
|
- [Jump Nodes](./2_patching/jump_nodes.md)
|
||||||
|
- [Invoke Dynamic Nodes](./2_patching/jump_nodes.md)
|
||||||
|
- [Integer Nodes](./2_patching/jump_nodes.md)
|
||||||
|
- [Integer Increment Nodes](./2_patching/jump_nodes.md)
|
||||||
|
- [LDC Nodes](./2_patching/jump_nodes.md)
|
||||||
|
- [Lookup Switch Nodes](./2_patching/jump_nodes.md)
|
||||||
|
- [MultiANewArray Nodes](./2_patching/jump_nodes.md)
|
||||||
|
- [Method Nodes](./2_patching/jump_nodes.md)
|
||||||
|
- [Table Switch Nodes](./2_patching/jump_nodes.md)
|
||||||
|
- [Type Nodes](./2_patching/jump_nodes.md)
|
||||||
|
- [Var Nodes](./2_patching/jump_nodes.md)
|
||||||
|
- [Pattern Matching](./2_patching/patterns.md)
|
|
@ -6,7 +6,7 @@ Lillero is made up of multiple components:
|
||||||
|
|
||||||
- [Lillero](https://github.com/zaaarf/lillero), the core library.
|
- [Lillero](https://github.com/zaaarf/lillero), the core library.
|
||||||
- [Lillero-processor](https://github.com/zaaarf/Lillero-processor), the annotation processor.
|
- [Lillero-processor](https://github.com/zaaarf/Lillero-processor), the annotation processor.
|
||||||
- [Lillero-mapper](https://github.com/zaaarf/lillero-mapper), a library which provides the ability to read multiple obfuscation mapping formats.
|
- [Lillero-mapper](https://github.com/zaaarf/lillero-mapper), a library providing the ability to read multiple obfuscation mapping formats.
|
||||||
- [Lillero-mapping-writer](https://github.com/zaaarf/Lillero-mapping-writer), a CLI tool for converting and inverting mapping formats.
|
- [Lillero-mapping-writer](https://github.com/zaaarf/Lillero-mapping-writer), a CLI tool for converting and inverting mapping formats.
|
||||||
|
|
||||||
On top of these, there's [Lillero-loader](https://github.com/zaaarf/lillero-loader), a sample loader, in form of a plugin for Minecraft Forge's [ModLauncher](https://github.com/McModLauncher/modlauncher).
|
On top of these, there's [Lillero-loader](https://github.com/zaaarf/lillero-loader), a sample loader, in form of a plugin for Minecraft Forge's [ModLauncher](https://github.com/McModLauncher/modlauncher).
|
||||||
|
|