Frida Python Multiprocessing: Solving Hanging Code Issues
When working with the Frida instrumentation toolkit in a Python multiprocessing environment, you may encounter issues due to the way modules are loaded and managed across processes. This article provides a detailed guide on how to properly use Frida with Python’s multiprocessing module, avoiding some bad surprises.
What Is Wrong
The core issue arises from the way Frida behaves in python multiprocessing. Frida, when imported at one process and later new process is started some methods will hang such as frida.get_device_manager().add_remote_device(dev_name)
, especially if the default fork start method is used. This could because fork copies the parent process’s memory space to the child process, leading to conflicts.
How Python Imports Work
In Python, when a module is imported, it is loaded into memory, and its top-level code is executed. This happens only once per process, and the module is then cached. Subsequent imports of the same module in the same process will fetch the module from the cache, avoiding re-execution.
Solution Overview
To avoid these issues, we can:
- Use the spawn start method for multiprocessing.
- Import Frida within the target function rather than at the top of the script.
Example
Here’s a step-by-step example explanation:
1. Importing Modules
We start by importing the required modules: multiprocessing and importlib.
1
2
import multiprocessing as mp
import importlib
2. Defining the Target Function
The target function get_frida_device
imports Frida within the function scope and reloads it to ensure proper initialization. This function also demonstrates adding and removing a remote device using Frida.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
def get_frida_device(called_from: str = None):
import frida # Import frida here to avoid issues
from frida.core import Device, Session, Script
importlib.reload(frida)
if called_from:
print(f"Called from {called_from}")
dev_name = "192.168.50.92:27042"
print("Before get_frida_device")
# This will get stuck if called in a subprocess without proper handling
device = frida.get_device_manager().add_remote_device(dev_name)
print(device)
print("After get_frida_device")
frida.get_device_manager().remove_remote_device(dev_name)
3. Setting the Multiprocessing Context
We set the multiprocessing context to spawn to avoid issues with the ‘fork’ method.
1
ctx = mp.get_context('spawn') # Use 'spawn' to avoid issues
4. Creating and Starting the Process
We create a new process targeting the get_frida_device
function and start it.
1
2
3
p = mp.Process(target=get_frida_device)
p.start()
p.join()
5. Calling the Function in the Main Process
Finally, we call the get_frida_device
function in the main process for comparison.
1
get_frida_device("main")
Full Code Listing
Below is the complete script that incorporates the steps mentioned above: to reproduce you can move the import statements to the top of the file and/or using fork multiprocessing context.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
import multiprocessing as mp
import importlib
def get_frida_device(called_from:str = None):
import frida # import frida here to avoid the issue
# import frida # don't import at the top of the file because it will break the multiprocessing
from frida.core import Device, Session, Script
importlib.reload(frida)
if called_from:
print(f"called from {called_from}")
dev_name = "192.168.50.92:27042"
print("before get_frida_device")
# if in sub process, it will stuck here
device = frida.get_device_manager().add_remote_device(dev_name)
print(device)
print("after get_frida_device")
frida.get_device_manager().remove_remote_device(dev_name)
ctx = mp.get_context('spawn') # use 'spawn' to avoid the issue
# ctx = mp.get_context('fork') # fork may cause the issue too
p = mp.Process(target=get_frida_device)
p.start()
p.join()
get_frida_device("main")
Key Points to Remember
- Import Location Matters: Import Frida within the function that will use it, not at the module level when using multiprocessing.
- Use Spawn Method: The ‘spawn’ start method creates a fresh Python interpreter process, avoiding memory space conflicts.
- Reload Module: Using
importlib.reload(frida)
ensures proper initialization in each process. - Avoid Fork: The ‘fork’ method can cause hanging issues with Frida due to memory space copying.
Final Words
With an understanding of this insidious issue and equipped with information on how to address and avoid it until it is permanently resolved by the developers you will have easy time instrumenting your next application.
Stay Curious.