This will allow:
* removal of the `unpackclass_` and `loader_` prefixing hack
* the `NameMap` may be used to remap common `unpackclass` and `client` classes to the same name (e.g. the buffer, node and bzip2 classes)
* (possibly as a long-term, low-priority goal) deobfuscating the unsigned signlink at the same time as the signed one, which will require joining up inherited member disjoint sets across libraries
Unfortunately it's going to require a massive change across everything that touches ASM in the codebase.
An initial rough plan is:
* [x] give each `Library` a name (as a side effect, we can use this to improve the client detection in the static scrambling code)
* [x] allow libraries to depend on other libraries
* [x] add a special `RuntimeLibrary` for representing the standard JDK classes
* [ ] add the library name to `MemberRef`
* [ ] remove support for grabbing a class by name from `ClassPath` - instead, this must always go through a `Library` which can resolve the class by looking through its transitive dependencies
* [ ] add library names to the `@OriginalXXX` annotations (or can we rely on the implicit names based on the `.yaml` file they're in?)
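The plan above can be pictured with a minimal sketch. It is not the project's code: the `Library` and `MemberRef` shapes, the `resolve` method and the dependency graph between `client`, `signlink`, `unpackclass` and the runtime are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class MemberRef:
    """Hypothetical member reference that also carries the library name."""
    library: str
    owner: str
    name: str
    desc: str


@dataclass
class Library:
    """Hypothetical named library with its own classes and dependencies."""
    name: str
    classes: dict[str, object] = field(default_factory=dict)
    dependencies: list[Library] = field(default_factory=list)

    def resolve(self, class_name: str, _seen: set[str] | None = None):
        """Search this library first, then its transitive dependencies
        (depth-first). Returns (library name, class) or None."""
        seen = _seen if _seen is not None else set()
        if self.name in seen:
            return None  # already visited, e.g. a shared dependency
        seen.add(self.name)
        if class_name in self.classes:
            return self.name, self.classes[class_name]
        for dep in self.dependencies:
            found = dep.resolve(class_name, seen)
            if found is not None:
                return found
        return None


# Assumed dependency graph, for illustration only.
runtime = Library("runtime", {"java/lang/Object": "<jdk class>"})
signlink = Library("signlink", {"a": "<signlink a>"}, [runtime])
client = Library("client", {"a": "<client a>"}, [signlink])
unpackclass = Library("unpackclass", {"a": "<unpackclass a>"}, [runtime])

# The same obfuscated name resolves differently depending on the library.
print(client.resolve("a"))       # ('client', '<client a>')
print(unpackclass.resolve("a"))  # ('unpackclass', '<unpackclass a>')
print(client.resolve("java/lang/Object"))  # ('runtime', '<jdk class>')

# Two members with identical owner/name/desc stay distinct once the
# library name is part of the reference.
print(MemberRef("client", "a", "b", "I") == MemberRef("unpackclass", "a", "b", "I"))
```

Because lookups always start from a specific `Library`, two libraries can both contain a class called `a` without any prefixing.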
`ClassPath` should probably be renamed to something like `LibrarySet` in the future, as each individual library will effectively manage its own classpath.
`ClassMetadata::dependency` could also be renamed, now that its meaning has changed such that it is only `true` if a class is from the runtime.
... or do we still need something like it? It is important to block renaming of overridden library methods that won't be remapped (as the library isn't in the `ClassPath`).
Perhaps `remap()` should walk the tree of libraries recursively?
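To make the rename-blocking concern concrete, here is a rough sketch of the check. `ClassInfo`, its `remappable` flag and `can_rename` are hypothetical names invented for this example, not the existing `ClassMetadata` API.

```python
from dataclasses import dataclass, field


@dataclass
class ClassInfo:
    """Hypothetical stand-in for class metadata."""
    name: str
    remappable: bool  # False for classes that will not be remapped
    methods: set = field(default_factory=set)
    supers: list = field(default_factory=list)  # superclass and interfaces


def can_rename(cls: ClassInfo, method: str) -> bool:
    """A method may be renamed only if no class in the hierarchy that
    declares it is excluded from remapping."""
    stack, seen = [cls], set()
    while stack:
        current = stack.pop()
        if current.name in seen:
            continue
        seen.add(current.name)
        if not current.remappable and method in current.methods:
            return False
        stack.extend(current.supers)
    return True


obj = ClassInfo("java/lang/Object", False, {"toString()Ljava/lang/String;"})
node = ClassInfo("a", True, {"toString()Ljava/lang/String;", "a()V"}, [obj])

print(can_rename(node, "a()V"))                          # True
print(can_rename(node, "toString()Ljava/lang/String;"))  # False
```

Whether the flag ends up meaning "from the runtime" or "from any library that is not being remapped" is exactly the open question above; the sketch only shows where such a flag would be consulted.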
Maybe the inherited member disjoint sets should be more aware of field/method resolution.
This would allow us to:
* remove the inherited field disjoint set entirely
* only include declared methods in the inherited method disjoint set
I've realised that the latter won't work, as field/method resolution is not deterministic - which is particularly problematic for interfaces (where two interfaces implemented by one class might contain a method with the same name and therefore must be renamed together).
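The interface case can be illustrated with a small union-find sketch. The class names (`I1`, `I2`, `C`), the `DisjointSet` class and `build_sets` are made up for this example and are not the deobfuscator's implementation. `C` declares nothing itself, yet it is the only thing connecting `I1.run()V` and `I2.run()V`, so a set built from declared methods alone would have no node to join them through.

```python
class DisjointSet:
    """Minimal union-find with path compression."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:  # compress the path to the root
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def all_ancestors(cls, supers):
    """All transitive superclasses/interfaces of cls."""
    out, stack = [], list(supers[cls])
    while stack:
        s = stack.pop()
        if s not in out:
            out.append(s)
            stack.extend(supers[s])
    return out


def build_sets(supers, declared):
    """Union each (class, method) with the same method in every ancestor,
    including methods the class only inherits."""
    ds = DisjointSet()
    for cls in supers:
        for m in declared[cls]:
            ds.find((cls, m))  # register declared members
        for anc in all_ancestors(cls, supers):
            for m in declared[anc]:
                ds.union((cls, m), (anc, m))
    return ds


supers = {"I1": [], "I2": [], "C": ["I1", "I2"]}
declared = {"I1": ["run()V"], "I2": ["run()V", "stop()V"], "C": []}

ds = build_sets(supers, declared)
# I1.run and I2.run end up in one set, so they are renamed together.
print(ds.find(("I1", "run()V")) == ds.find(("I2", "run()V")))   # True
print(ds.find(("I2", "stop()V")) == ds.find(("I1", "run()V")))  # False
```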
Another alternative to the above: take the class prefixing hack further and prefix _everything_ as we read it and strip the prefixes just before we write it.
So we'd have a sequence like:
`a` => `client$a` => `client$Class1` => `client$Node` => `Node` (not ideal as it conflicts with inner classes, so the log messages might cause confusion)
or:
`a` => `client/a` => `client/Class1` => `client/Node` => `Node` (which would make renaming packaged classes easier, but doesn't stand out as much as being "special" syntax)
Perhaps combining both, we could have something like `$client/a`, `client$/a` or `$client$/a`. `$client$/a` perhaps stands out the most.
This means we wouldn't need to change the guts of the ASM library, bundler and deobfuscator: just the first and last stages of the deobfuscator.
I like this better than only renaming a selection of the libraries (as we do now): it becomes unambiguous.
The `@OriginalXXX` syntax could also understand the names, and extract the library part out into a separate field.
It also means we could remove the `Library` class entirely (except for reading/writing classes?) and the `ClassPath` could store a list of `ClassNode`s directly. This would allow us to remove an argument from a bunch of transformer methods and also remove a level of nesting from a bunch of loops.
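A small sketch of the read/strip round trip, assuming the `$client$/a` form is the one chosen. The helper names `add_prefix` and `strip_prefix` are hypothetical and this is not taken from the actual implementation.

```python
def add_prefix(library: str, internal_name: str) -> str:
    """Applied as classes are read: 'a' in client -> '$client$/a'."""
    return f"${library}$/{internal_name}"


def strip_prefix(name: str) -> tuple:
    """Applied just before writing: '$client$/Node' -> ('client', 'Node')."""
    if name.startswith("$"):
        end = name.find("$/", 1)
        if end != -1:
            return name[1:end], name[end + 2:]
    raise ValueError(f"not a prefixed class name: {name}")


# a => $client$/a => $client$/Class1 => $client$/Node => Node
read_name = add_prefix("client", "a")
print(read_name)                      # $client$/a
print(strip_prefix("$client$/Node"))  # ('client', 'Node')

# The same obfuscated name in two libraries no longer collides, and the
# library part can be split out, e.g. for the @OriginalXXX annotations.
print(add_prefix("client", "a") != add_prefix("unpackclass", "a"))  # True

# Inner-class '$' and package '/' separators after the prefix are untouched.
print(strip_prefix("$client$/com/example/Outer$Inner"))
```

Only the leading `$library$/` marker is parsed, so names that contain `$` or `/` later on round-trip unchanged.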
I've gone with the prefixing solution - see b1bc7377fce8f80f8aee19efd5cca1a6e48dd76d