feat: writing basics

2025-01-03 07:54:53 +01:00 · 2024-12-21 23:24:31 +01:00 · 2024-12-21 23:24:31 +01:00 · 99aa20c23d
commit 99aa20c23d
parent d6ec6dd14b
26 changed files with 190 additions and 20 deletions
--- a/.editorconfig
+++ b/.editorconfig
@ -0,0 +1,6 @@
 [*]
 end_of_line = lf
 insert_final_newline = true
 charset = utf-8
 indent_style = tab
 indent_size = 4
--- a/src/2_patching/bytecode.md
+++ b/src/2_patching/bytecode.md
@ -1 +0,0 @@
 # Bytecode
--- a/src/2_patching/bytecode_examples.md
+++ b/src/2_patching/bytecode_examples.md
@ -1 +0,0 @@
 # Examples
--- a/src/2_patching/jump_nodes.md
+++ b/src/2_patching/jump_nodes.md
@ -1 +0,0 @@
 # Labels and Jump Nodes
--- a/src/2_patching/nodes/integer.md
+++ b/src/2_patching/nodes/integer.md
@ -0,0 +1,2 @@
 # Integer Nodes
 TODO
--- a/src/2_patching/nodes/integer_increment.md
+++ b/src/2_patching/nodes/integer_increment.md
@ -0,0 +1,2 @@
 # Integer Increment Nodes
 TODO
--- a/src/2_patching/nodes/introduction.md
+++ b/src/2_patching/nodes/introduction.md
@ -6,3 +6,6 @@ Each instruction is a node, represented by [various subclasses](https://asm.ow2.
 The `InsnList` representing the method's nodes is `MethodNode`'s `instructions` field. You can perform all operations you'd expect: append, insert, remove, etcetera. You should aim to leave the smallest possible footprint on the method, so *removing* nodes is almost always a bad idea. You can achieve the same result by *jumping over* the part you wish to remove, without breaking lookup done by other patchers.
 We'll now broadly check out the various types of instruction nodes; you can find a detailed list of opcodes, with explanations, both on [this Wikipedia page](https://en.wikipedia.org/wiki/List_of_Java_bytecode_instructions) and on the [Java SE Specifications](https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-6.html), so going over them one by one seems futile. Just keep the reference at hand when patching, and you'll be fine: it's not like anybody actually knows all of their functionalities by heart. At least, I hope not.
 ## Categorization
 This book will follow the same categorization of nodes from the ASM library. Specifically, it divides them by number and type of arguments that each node takes. We will go over them one-by-one.
--- a/src/2_patching/nodes/invokedynamic.md
+++ b/src/2_patching/nodes/invokedynamic.md
@ -0,0 +1,2 @@
 # Invoke Dynamic Nodes
 TODO
--- a/src/2_patching/nodes/jump.md
+++ b/src/2_patching/nodes/jump.md
@ -0,0 +1,3 @@
 # Jump Nodes
 TODO
 Jump nodes are your bread and butter. You will likely be using these more than all the others (except maybe for `POP`, depending on what you'll be doing).
--- a/src/2_patching/nodes/ldc.md
+++ b/src/2_patching/nodes/ldc.md
@ -0,0 +1,2 @@
 # LDC Nodes
 TODO
--- a/src/2_patching/nodes/lookup_switch.md
+++ b/src/2_patching/nodes/lookup_switch.md
@ -0,0 +1,2 @@
 # Lookup Switch Nodes
 TODO
--- a/src/2_patching/nodes/method.md
+++ b/src/2_patching/nodes/method.md
@ -0,0 +1,2 @@
 # Method Nodes
 TODO
--- a/src/2_patching/nodes/multi_a_new_array.md
+++ b/src/2_patching/nodes/multi_a_new_array.md
@ -0,0 +1,2 @@
 # MultiANewArray Nodes
 TODO
--- a/src/2_patching/nodes/non_instruction.md
+++ b/src/2_patching/nodes/non_instruction.md
@ -0,0 +1,7 @@
 # Non-Instruction Nodes
 Perhaps unintuitively, the first nodes that we are going to cover do *not* contain any instructions. There are three kinds: **Line Numbers**, **Frame Changes** and **Labels**.
 You needn't concern yourself with the first two: line numbers provide the information used to print line numbers in stack traces, and frame changes signal a stack frame change. The ones that are actually interesting are labels.
 Labels, by themselves, do nothing: their purpose is to mark a location in the bytecode by giving it a name. Although labels generated by the compiler will generally be unintelligible to you (e.g. "L11"), you can actually name your labels whatever you want. What purpose do they serve? They can be combined with Jump Nodes (see the next chapter) to provide control flow.
 In the ASM library, you're looking for the class `LabelNode`.
--- a/src/2_patching/nodes/table_switch.md
+++ b/src/2_patching/nodes/table_switch.md
@ -0,0 +1,2 @@
 # Table Switch Nodes
 TODO
--- a/src/2_patching/nodes/type.md
+++ b/src/2_patching/nodes/type.md
@ -0,0 +1,2 @@
 # Type Nodes
 TODO
--- a/src/2_patching/nodes/var.md
+++ b/src/2_patching/nodes/var.md
@ -0,0 +1,2 @@
 # Var Nodes
 TODO
--- a/src/2_patching/patching.md
+++ b/src/2_patching/patching.md
@ -1,7 +1,7 @@
 # Patching
 Since you are applying changes to the bytecode of a class, this must necessarily happen before said class is loaded in memory. The component that applies said changes is called a *loader*; don't concern yourself with the inner workings of loaders for now, just know that they are in charge of that initial step: we'll cover them in detail in their own chapter.
-Generally speaking, you can solve *any* problem that can be solved via patching by modifying one or more methods in the correct way. This is preferrable, as you're unlikely to inadvertedly break compatibility with other parts of the program relying on that method if you stick to making small changed to the function body.
+Generally speaking, you can solve *any* problem that can be solved via patching by modifying one or more methods in the correct way. This is preferable, as you're unlikely to inadvertently break compatibility with other parts of the program relying on that method if you stick to making small changes to the function body.
 Suppose that you already have a working loader in place. This loader calls your *injector method*, and passes it a [`ClassNode`](https://asm.ow2.io/javadoc/org/objectweb/asm/tree/ClassNode.html) and a [`MethodNode`](https://asm.ow2.io/javadoc/org/objectweb/asm/tree/MethodNode.html) as arguments, representing respectively the container class and the method you're targeting. This is the most common type of ASM patching, and it's probably why you're here; more advanced subjects may be covered in additional chapters later on.
--- a/src/2_patching/patterns.md
+++ b/src/2_patching/patterns.md
@ -1 +0,0 @@
 # Pattern Matching
--- a/src/2_patching/stack.md
+++ b/src/2_patching/stack.md
@ -1 +0,0 @@
 # Stack-oriented Programming
--- a/src/2_patching/writing/basic.md
+++ b/src/2_patching/writing/basic.md
@ -0,0 +1,28 @@
 # Writing a Patch
 Let's assume that you've figured out all the boilerplate, or automated it with [Lillero-processor](https://github.com/zaaarf/lillero-processor/). If you are wondering how to use that, refer to the project's README. I don't particularly wish to maintain a second independent copy of that information.
 Take the following example:
 ```java
 private int counter = 0;
 public void incrementCounter() {
    this.counter++;
 }
 ```
 Assume that `counter` is not directly incremented anywhere, and all calls pass through the method. Your task is to break the counter, and ensure it stays zero.
 Again, you've already written your boilerplate: all that's left is the actual injection method. You have a `ClassNode` and a `MethodNode`; you probably don't need the `ClassNode` at all. How do you modify it, though? You should know that `method.instructions` is an [`InsnList`](https://asm.ow2.io/javadoc/org/objectweb/asm/tree/InsnList.html), which means you can manipulate it freely. One such way to do it (and really the only one you need in almost all tasks) is to insert new instruction nodes.
 Look at your code: in this case, with this assumption, the easiest way to achieve your task is obviously to return early. 
 ```java
 public void inject(ClassNode clazz, MethodNode method) {
 	// RETURN (from org.objectweb.asm.Opcodes) is the opcode for returning from a void method
 	method.instructions.insert(new InsnNode(RETURN));
 }
 ```
 The `insert` method added `RETURN` (the opcode for a `return` without a value, not to be confused with `RET`, which returns from a `JSR` subroutine) right at the start of the list, since we did not specify a position. While `javac` would refuse to compile a method like this one because it contains unreachable code, the bytecode sequence it would produce is actually perfectly valid; thus, using Lillero to create it is perfectly fine. This is not the first discrepancy you will encounter between what `javac` wants you to do and what you actually can do.
 Unfortunately, most patches are not as straightforward.
--- a/src/2_patching/writing/collisions.md
+++ b/src/2_patching/writing/collisions.md
@ -0,0 +1,21 @@
 # Mitigating Collisions
 Despite being an above average programmer, having read this book and taken all the precautions on God's green earth, the unthinkable has still happened: your patches conflict with someone else's. That's fine, no need to panic. It may not even have been your fault. It may be the other guy's fault, or it may be that there is no conceivable way to implement this patch in a sturdier way. Regardless, let's assume that working together with the other guy is not an option, and that you absolutely have to fix it yourself.
 You have a few ways to go about this.
 ## Pattern matching as validation
 Assuming that the loader is implemented according to the requirements (see the relevant chapter), it's perfectly acceptable for pattern matching to fail. This merely indicates that someone else has tampered with the same area, and you don't want to risk a patch there. Therefore, you should take care to pattern-match all of the area that is critical to your patch, so that it will fail to apply if it's been tampered with. 
 In some cases, you may want to catch the `PatternNotFoundException` and re-throw it wrapped in a `RuntimeException` so it doesn't get caught; however, that is a relatively rare occurrence, and typically concerns a patch that is so core to your system that you have no conceivable way of recovering from its failure. Anything that messes with the base code is prone to breaking, so take care.
 ## Wrapping extra code
 Assuming that you did everything according to this specification, you only *added* nodes, never removing them. If so, you can simply wrap all of your additional opcodes between a call to some sort of check followed by an `IFNE` on one side, and a label on the other. This way, all your extra code is self-contained. Let me also remind you, once again, that the resulting JVM code doesn't necessarily have to translate to valid Java.
 This option may be more suitable to cases where the buggy collision happens only when certain conditions are met: this way, your code is only off when it needs to be. And, since you're *injecting* the check itself as well, you can rely on all the information you can expose at runtime.
 Typically, you'd check against the thing that you *know* is breaking your code; if you can't, for some reason, you should add some sort of setting and check against that, so the user may disable this if he knows that some other patcher he's using conflicts with it.
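 In Java terms, the wrapped region behaves roughly like the guarded block in the sketch below. This is only an illustration of the control flow: `PatchToggle`, `shouldSkip`, `setDisabled` and `GuardedTarget` are hypothetical names made up for this example, and the `counter += 10` line merely stands in for whatever opcodes your patch adds.
 ```java
 final class PatchToggle {
 	// Hypothetical user-facing setting; volatile so that changes are visible across threads.
 	private static volatile boolean disabled = false;
 	static void setDisabled(boolean value) {
 		disabled = value;
 	}
 	// The injected INVOKESTATIC would call this; a true (non-zero) result makes IFNE jump to the label.
 	static boolean shouldSkip() {
 		return disabled;
 	}
 }
 final class GuardedTarget {
 	int counter = 0;
 	void patchedMethod() {
 		// injected: INVOKESTATIC PatchToggle.shouldSkip, then IFNE skip
 		if (!PatchToggle.shouldSkip()) {
 			this.counter += 10; // stand-in for the extra opcodes
 		}
 		// injected: the "skip" label sits here
 		this.counter++; // original method body, untouched
 	}
 }
 ```
 With the toggle off, a call to `patchedMethod` runs both the extra code and the original body; after `PatchToggle.setDisabled(true)`, only the original body runs, exactly as if the patch had never been applied.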
 ## Environmental checks
 This is the most complicated (and least recommended) approach to take. However, it may be the only one in some cases. If you have some way to know who else is going to be altering the classpath at the time `inject` is called, you can do a check on that and avoid applying the patch altogether.
 In Minecraft, this can typically be implemented by using the mod loader's API to check whether other core mods are being applied, and if so which ones. It's unlikely to have good performance, but unlike the previous one, the check is only done once.
--- a/src/2_patching/writing/guidelines.md
+++ b/src/2_patching/writing/guidelines.md
@ -0,0 +1,20 @@
 # Guidelines
 As patching is merely another form of programming, there is no general "correct" answer that we can easily determine. If there was, this could all be automated.
 There are three main factors that affect the quality of a patch:
 - **Performance hit**, which is how much the change will damage performance.
 - **Invasiveness**, which is how likely the patch is to break other patches working on the same area.
 - **Fragility**, which is how likely the patch is to break when confronted with other patches working on the same bit.
 Depending on your specific environment, you may weigh some of these concerns more than others. For instance, in the case of Minecraft, you typically care relatively little about performance (especially if it's just a matter of adding a few opcodes) but a great deal about invasiveness and fragility. In an environment where you can control what patches get applied or where you have the guarantee that every patcher is competent or at least guaranteed to try to fix their mistakes, the opposite may be true.
 Make your own considerations and act accordingly. That being said, there are a few general rules of etiquette which you should strive to follow. It may not always be possible to comply with all of them, but you should at least try.
 0. **Don't make a patch if you don't need to.** Your reasoning for writing a patch may be as simple as noting that it would be more efficient; just, please, ensure that it's a good one.
 1. **Don't delete nodes.** Deleting nodes will obviously make all pattern matching in the area fail; however, that's not necessarily something you want. Some less-than-clever loaders will crash if their pattern matching equivalent fails; if you have no guarantee that your environment will be clear of them, you should be mindful of invasiveness but not necessarily of fragility. Returning early or jumping over the nodes are typically better alternatives (though in some specific, performance-critical parts they may not be viable).
 2. **Use pattern matching over position matching.** Some people like to find their instruction node by going a fixed number of nodes down the list. Those people are stupid. That's a surefire way to write code rigged to explode in any environment with multiple patchers, *even if those patches aren't touching the same part of the method*.
 3. **If you are adding a lot of instructions, consider using a method instead.** As we've seen, it's perfectly doable to call a method, and with the processor it's especially easy. So, if you feel like you are adding too many nodes, write that in Java in a static method and add a call to it. The performance hit from a function is typically negligible, and it will spare you a number of mistakes, and also potential problems arising from confusing other people's `PatternMatcher`s.
 4. **Be mindful of bloating critical functions.** Consider this an appendix to the previous one. The performance hit from a few extra opcodes isn't going to matter, nor is one for an extra function call *in itself*. It may however matter if your function has O(n³) complexity and is called hundreds of times every second. Use your brain.
 	- Traditionally, especially in Minecraft, patches have been used to emit events in certain parts of the code; other, higher-level parts of the mod will then subscribe to them, and run some function when they happen. This is not a bad design in any way, *if implemented sensibly*, but please be careful in adding events. In my opinion, you should only go for an event if you are very sure that you'll need to execute custom behaviour there multiple times. Even then, be very careful not to make the functions that execute on event calls too expensive.
 5. **Failure is better than a misfire.** A patch accidentally applied in the wrong spot can do damage, and if you are unlucky it may be in hard-to-detect ways. In almost all cases, it's better for the pattern matching to fail than to cause unexpected behaviour.
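 To make rules 2 and 5 a bit more concrete, here is a toy model that uses a plain list of opcode numbers in place of a real `InsnList`. It is not how Lillero works internally: `PositionVsPattern` and `findEndOf` are names invented for this sketch, and only the opcode values are the real ones from the JVM specification.
 ```java
 import java.util.ArrayList;
 import java.util.Arrays;
 import java.util.Collections;
 import java.util.List;
 final class PositionVsPattern {
 	static final int NOP = 0, ALOAD = 25, IFEQ = 153, GETFIELD = 180;
 	// Pattern matching: index of the last opcode of the first match, or -1 if there is none.
 	static int findEndOf(List<Integer> code, Integer... pattern) {
 		int start = Collections.indexOfSubList(code, Arrays.asList(pattern));
 		return start < 0 ? -1 : start + pattern.length - 1;
 	}
 	public static void main(String[] args) {
 		List<Integer> code = new ArrayList<>(Arrays.asList(ALOAD, GETFIELD, IFEQ));
 		int fixedOffset = 2; // "the jump is the third node": true only for now
 		// Another patcher prepends two unrelated nodes, nowhere near the jump itself.
 		code.addAll(0, Arrays.asList(NOP, NOP));
 		// The fixed offset now points at ALOAD: a misfire waiting to happen.
 		System.out.println(code.get(fixedOffset) == IFEQ); // prints "false"
 		// The pattern still finds the jump, wherever it moved to.
 		System.out.println(code.get(findEndOf(code, ALOAD, GETFIELD, IFEQ)) == IFEQ); // prints "true"
 		// If the area itself was altered, the search fails instead of misfiring.
 		System.out.println(findEndOf(code, ALOAD, IFEQ)); // prints "-1"
 	}
 }
 ```
 The fixed offset silently lands on the wrong node, while the pattern either finds the right one or reports a failure you can act upon.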
--- a/src/2_patching/writing/patterns.md
+++ b/src/2_patching/writing/patterns.md
@ -0,0 +1,60 @@
 # Pattern Matching
 Take the following example:
 ```java
 public boolean controlFlag = true;
 private int counter = 0;
 public void incrementCounterConditionally() {
 	// assume some other code here
 	if(this.controlFlag) {
 		this.counter++;
 	} else {
 		this.counter--;
 	}
 	// assume some other code here
 }
 ```
 You are supposed to stop the counter from ever *decrementing*, but allow it to increment just fine. The assumption that no other piece of code will alter `counter` directly stands, but the same cannot be said for `controlFlag`. There are a few ways to approach the problem; however, this time, you're going to have to change code that is neither at the start, nor at the end of the method.
 This is why you need **pattern matching**. It is a feature of Lillero and the primary tool by which you will be patching. Its primary implementation class, the [`PatternMatcher`](https://docs.zaaarf.foo/lillero/ftbsc/lll/utils/PatternMatcher.html), reads through the method and attempts to identify sequences of opcodes satisfying user-specified parameters. If used correctly, this also doubles as a validity check: in most cases, you should structure your pattern matches in a way that they will fail if - and only if - the area you're targeting has been touched by others. This is not always possible, but you should strive towards it.
 Assuming that you can now reach any position in the method and add new opcodes in it (we'll see how in a minute), we now have to wonder about how, exactly, we can implement this change. Here are three examples:
 - Disable or nullify the decrementing in some way.
 	- This might be ideal in some circumstances; it could certainly be the least invasive option, depending on how you implement it. However, it's likely not going to be the most efficient one.
 - Delete the decrementing altogether.
 	- Don't do this. Deleting opcodes, especially more than one, is extremely invasive and fragile.
 - Rig the if check so that it will never be false.
 	- This is the most efficient option. It might be more or less invasive than the first one, depending on how you implement it, but people shouldn't be matching against whole blocks anyway unless they intend to change them entirely.
 Speaking strictly of the best solution, I would personally choose the third one: it's elegant, efficient and unlikely to fail. Just add a `POP` and an `ICONST_1` before the conditional jump (an `IFEQ`, as `javac` normally emits it for this kind of `if`), so that the jump to the `else` branch is never taken. However, for the purposes of our pattern matching example, let's assume that we chose to proceed with the first one. Once again, there are multiple approaches we can consider. Here are a few:
 - Immediately increment the value after decrementing it.
 	- This is the least invasive option. It has a performance hit compared to the original, but if you don't care about that (it's very negligible), it will quietly undo the decrement probably without bothering other patchers. However, I would argue that it's quite fragile, as its outcome depends on a previous state; I would not recommend this.
 - Replace the constant that's being subtracted with a 0.
 	- This is quite invasive, as you are inserting in the middle of an operation that is not separated in the higher-level code, but it is the most performant option this side of the `if` check. This will make attempts to match patterns (see below) on that `this.counter--` fail, which may or may not be desirable to you.
 - Unconditionally jump over the decrement.
 	- This is only mildly invasive, and has pretty good performance: it could be a valid option (only in this hypothetical world where you can't rig the `if` check, otherwise you should probably do that).
 Let's assume that you opted for the last option. Not because it's necessarily the best one, but because it's the one that is most useful to showcase what you should and shouldn't do. In order to apply this patch, you'll need a `GOTO` just before the `this.counter--`, and its matching label immediately after. As `this.counter--` is actually a sequence of multiple opcodes, this is less trivial than it seems.
 Here is how you match such a sequence with the `PatternMatcher`:
 ```java
 // this.counter-- compiles to: load this, duplicate it, read the field, push 1, subtract, write the field back
 InsnSequence matchedSequence = PatternMatcher.builder()
 	.opcodes(ALOAD, DUP, GETFIELD, ICONST_1, ISUB, PUTFIELD)
 	.ignoreLineNumbers()
 	.ignoreLabels()
 	.ignoreFrames()
 	.build()
 	.find(method);
 ```
 As matching line numbers, labels and frames is quite unreliable, it is typically a good idea to ignore them. You can now `insertBefore` the first node of the sequence and `insert` after the last one to get the desired result.
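 If you are curious about what such a search boils down to, the following is a toy model of it over a plain array of opcode numbers. It is a sketch written for this page, not Lillero's actual implementation: `ToyMatcher` and `find` are made-up names. The one real detail it relies on is that ASM reports `-1` as the opcode of labels, line numbers and frames, which is what makes them easy to skip.
 ```java
 import java.util.Arrays;
 final class ToyMatcher {
 	// ASM reports -1 as the opcode of labels, line numbers and frames.
 	static final int NOISE = -1;
 	// Real JVM opcode values, as listed in the JVM specification.
 	static final int ICONST_1 = 4, ALOAD = 25, DUP = 89, IADD = 96, ISUB = 100, GETFIELD = 180, PUTFIELD = 181;
 	// Returns {first, last} indices of the first match, or null if there is none.
 	static int[] find(int[] code, int... pattern) {
 		for (int start = 0; start < code.length; start++) {
 			if (code[start] == NOISE) continue; // a match never starts on noise
 			int matched = 0;
 			int last = -1;
 			for (int i = start; i < code.length && matched < pattern.length; i++) {
 				if (code[i] == NOISE) continue; // skip noise inside the match
 				if (code[i] != pattern[matched]) break;
 				matched++;
 				last = i;
 			}
 			if (matched == pattern.length) return new int[] { start, last };
 		}
 		return null;
 	}
 	public static void main(String[] args) {
 		int[] code = {
 			ALOAD, DUP, GETFIELD, ICONST_1, IADD, PUTFIELD, // this.counter++
 			NOISE, // a label
 			ALOAD, DUP, GETFIELD, NOISE, ICONST_1, ISUB, PUTFIELD // this.counter--, with a stray line number inside
 		};
 		int[] hit = find(code, ALOAD, DUP, GETFIELD, ICONST_1, ISUB, PUTFIELD);
 		System.out.println(Arrays.toString(hit)); // prints "[7, 13]"
 	}
 }
 ```
 The two indices correspond to the first and last node of the matched sequence, which are exactly the two anchors you need: one to put the `GOTO` before, one to put the label after.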
--- a/src/2_patching/writing/proxies.md
+++ b/src/2_patching/writing/proxies.md
@ -0,0 +1,2 @@
 # Proxies
 TODO
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@ -4,20 +4,25 @@
 	- [Why (not) Mixin?](./1_introduction/why_mixin.md)
 	- [Why Lillero?](./1_introduction/why_lillero.md)
 	- [Your Toolbox](./1_introduction/toolbox.md)
- [Patching Methods](./2_patching/patching.md)
+- [Patching](./2_patching/patching.md)
 	- [Bytecode](./2_patching/bytecode/introduction.md)
 		- [Stack-oriented Programming](./2_patching/bytecode/stack.md)
 		- [Examples](./2_patching/bytecode/examples.md)
-	- [Nodes](./2_patching/nodes.md)
+	- [Nodes](./2_patching/nodes/introduction.md)
-		- [Jump Nodes](./2_patching/jump_nodes.md)
+		- [Non-Instruction Nodes](./2_patching/nodes/non_instruction.md)
-		- [Invoke Dynamic Nodes](./2_patching/jump_nodes.md)
+		- [Jump Nodes](./2_patching/nodes/jump.md)
-		- [Integer Nodes](./2_patching/jump_nodes.md)
+		- [Invoke Dynamic Nodes](./2_patching/nodes/invokedynamic.md)
-		- [Integer Increment Nodes](./2_patching/jump_nodes.md)
+		- [Integer Nodes](./2_patching/nodes/integer.md)
-		- [LDC Nodes](./2_patching/jump_nodes.md)
+		- [Integer Increment Nodes](./2_patching/nodes/integer_increment.md)
-		- [Lookup Switch Nodes](./2_patching/jump_nodes.md)
+		- [LDC Nodes](./2_patching/nodes/ldc.md)
-		- [MultiANewArray Nodes](./2_patching/jump_nodes.md)
+		- [Lookup Switch Nodes](./2_patching/nodes/lookup_switch.md)
-		- [Method Nodes](./2_patching/jump_nodes.md)
+		- [MultiANewArray Nodes](./2_patching/nodes/multi_a_new_array.md)
-		- [Table Switch Nodes](./2_patching/jump_nodes.md)
+		- [Method Nodes](./2_patching/nodes/method.md)
-		- [Type Nodes](./2_patching/jump_nodes.md)
+		- [Table Switch Nodes](./2_patching/nodes/table_switch.md)
-		- [Var Nodes](./2_patching/jump_nodes.md)
+		- [Type Nodes](./2_patching/nodes/type.md)
-	- [Pattern Matching](./2_patching/patterns.md)
+		- [Var Nodes](./2_patching/nodes/var.md)
 	- [Writing Patches](./2_patching/writing/basic.md)
 		- [Pattern Matching](./2_patching/writing/patterns.md)
 		- [Proxies](./2_patching/writing/proxies.md)
 		- [Guidelines](./2_patching/writing/guidelines.md)
 		- [Mitigating Collisions](./2_patching/writing/collisions.md)