Debugging guide
This tutorial shows how to debug the workflow and the training algorithms.
Workflow Debugging (--backbone=debug)
- Install VSCode and connect to the GPU server.
VSCode is open-source software, and all of its debugging plugins are free, so we use it as our debugging platform.
VSCode can connect to a remote SSH server and operate it as if it were your local machine. For more details, please refer to the official VSCode documentation.
- Install the VSCode Python Extension Bundle.
- Create .vscode/launch.json. If the .vscode directory does not exist yet, create it.
- Copy and paste the following configuration into launch.json:
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python Debugger: Launch rollout",
            "type": "debugpy",
            "request": "launch",
            "module": "ajet.launcher",
            "console": "integratedTerminal",
            "args": [
                "--backbone", "debug",
                "--conf", "./path/to/yaml.yaml"
            ],
            "env": {}
        }
    ]
}
- Change the ./path/to/yaml.yaml field to the path of your task YAML file.
- For more sophisticated tasks that rely on additional external services, add environment variables or extra arguments. For example, if your original training command is:
export DASHSCOPE_API_KEY="sk-abcdefg"
ajet --conf tutorial/example_appworld/appworld.yaml --with-appworld --backbone='verl'
Then the modified launch.json becomes:
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python Debugger: Launch rollout",
            "type": "debugpy",
            "request": "launch",
            "module": "ajet.launcher",
            "console": "integratedTerminal",
            "args": [
                "--backbone", "debug", // verl -> debug
                "--conf", "tutorial/example_appworld/appworld.yaml",
                "--with-appworld"
            ],
            "env": {
                "DASHSCOPE_API_KEY": "sk-abcdefg"
            }
        }
    ]
}
- Press F5 to start debugging.
- You can now set breakpoints inside the workflow to observe program execution.
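The conversion from a training command to a launch.json entry, as described in the steps above, can be sketched in Python. This is only an illustration under stated assumptions: `build_launch_config` is a hypothetical helper and not part of AgentJet, and it assumes the command starts with `ajet` and that environment variables are passed in separately as a dictionary. It forces the backbone to `debug` and copies every other argument unchanged.

```python
import json
import shlex


def build_launch_config(command: str, env: dict[str, str]) -> dict:
    """Turn an `ajet ...` training command into a launch.json dictionary.

    Hypothetical helper for illustration; AgentJet does not ship this function.
    """
    tokens = shlex.split(command)  # shlex also strips the quotes around 'verl'
    if not tokens or tokens[0] != "ajet":
        raise ValueError("expected a command that starts with 'ajet'")

    args = ["--backbone", "debug"]  # always debug, whatever the command used
    rest = iter(tokens[1:])
    for token in rest:
        if token.startswith("--backbone="):
            continue  # drop --backbone=verl
        if token == "--backbone":
            next(rest, None)  # drop the separately written value as well
            continue
        if token.startswith("--") and "=" in token:
            # split --flag=value into two list entries
            args.extend(token.split("=", 1))
        else:
            args.append(token)

    return {
        "version": "0.2.0",
        "configurations": [
            {
                "name": "Python Debugger: Launch rollout",
                "type": "debugpy",
                "request": "launch",
                "module": "ajet.launcher",
                "console": "integratedTerminal",
                "args": args,
                "env": env,
            }
        ],
    }


config = build_launch_config(
    "ajet --conf tutorial/example_appworld/appworld.yaml --with-appworld --backbone='verl'",
    {"DASHSCOPE_API_KEY": "sk-abcdefg"},
)
print(json.dumps(config, indent=4))
```

The printed JSON can be pasted into .vscode/launch.json. Writing the file directly is left out on purpose, so that an existing configuration is not overwritten by accident.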
General Debugging (Ray Distributed Debugger)
- Install the Ray Distributed Debugger extension in VSCode.
- In the AgentJet project:
2-1. At the place where you want to set a conditional breakpoint, write
from ajet import bp; bp("TAG_1")
2-2. When launching the training process, add --debug as a command-line argument
ajet --conf your_config.yaml --debug="TAG_1"
2-3. Open the "Ray Distributed Debugger" tab in VSCode and wait until the breakpoint is hit.
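To make the idea of a tag-gated breakpoint concrete, here is a minimal sketch in plain Python. It is not the implementation of `ajet.bp`, whose internals this guide does not describe. The names `parse_debug_tags`, `should_break`, and `tagged_bp` are hypothetical, and the sketch assumes that several tags are passed as one string separated by `|`, as in `--debug="TAG1|TAG2|TAG3"`.

```python
def parse_debug_tags(spec: str) -> set[str]:
    """Split a value such as "TAG1|TAG2|TAG3" into a set of tag names."""
    return {tag.strip() for tag in spec.split("|") if tag.strip()}


def should_break(tag: str, enabled: set[str]) -> bool:
    """Return True when a breakpoint labelled `tag` is enabled."""
    return tag in enabled


def tagged_bp(tag: str, enabled: set[str]) -> None:
    """Hypothetical stand-in for a tag-gated breakpoint; not ajet's bp."""
    if should_break(tag, enabled):
        breakpoint()  # Python built-in; pauses in the configured debugger


enabled = parse_debug_tags("TAG_1|TAG_2")
print(should_break("TAG_1", enabled))  # True
print(should_break("TAG_3", enabled))  # False
```

The point of the design is that breakpoints can stay in the code: a breakpoint whose tag is not listed on the command line is simply skipped, so a normal training run is not interrupted.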
Comparison
| Feature | Workflow Debugging | General Debugging (Ray) |
|---|---|---|
| Backend | debug, tinker | verl, trinity |
| Reboot Speed | Very Fast | Slow |
| Debug Target | Workflow | Everything |
| VSCode Extension | Python | Python + Ray Distributed Debugger |
| Launch Mode | F5 standard launch (via launch.json) | Command-line execution with ajet ... --debug="TAG" |
| Command Line | --backbone=debug | --debug="TAG1\|TAG2\|TAG3" |