# Orion

**Source:** Orion: Fuzzing Workflow Automation

## Notes

This is an approach to fuzzing automation presented in the 'Orion: Fuzzing Workflow Automation' paper by employees of Nvidia.

## Process

- Harness generation and execution
  - Takes target project source code as input
  - Constructs a codebase index
    - The codebase is chunked on the basis of functions
  - Selects interfaces for fuzzing by ranking non-static functions by how likely it thinks fuzzing will trigger bugs
    - This ranking is done by computing a few metrics:
      - Cyclomatic complexity
        - Number of independent paths through a function
      - Internal function calls
        - How frequently a function is called by others
      - Lines of code
      - Callgraph size
        - Number of functions reachable from the given function
        - This seems to be an attempt to place more weight on more important functionality, which might be misguided.
          - If a part of the codebase is very integral, it seems likely that it is already better tested.
      - Dangerous expressions
        - Constructs like pointer arithmetic, memory management, bit operations (detected by the LLM...)
        - Honestly, I'd just use regex...
      - Sink functions
        - Functions associated with vulnerabilities
        - Again... why are you using LLMs!? Just use regex
      - Parsing functions
        - Functions that parse structured inputs
        - Fair play for LLM usage here.
  - Generates seed inputs for each selected function
  - Generates the harness
    - A dependency analysis agent identifies setup and teardown processes and header file dependencies
    - Constructs compilable fuzz drivers compatible with the generated seeds
  - The resulting harnesses and seeds are then sent to the fuzzing infra, which executes the fuzzer, monitors for errors, and records the results
- Crash handling
  - The triage agent filters out harness-related issues
  - The triage agent root-causes issues and creates minimal repros
- Patching
  - The patching agent patches the issue
  - The minimal repros from the prior step are used to validate the fix
  - If the issue still exists, the patching agent tries again

## Questions During Reading

- How do they ensure the bug triaging / patching doesn't result in further regressions?
  - It seems likely they are finding real issues, but the patches would be dubious, even if they resolve the problem for a given byte buffer input to the harness.
  - Towards the end of the paper they say, basically, that a human reviews the patch at the end and that they ensure it passes minimal tests
    - This seems like they basically just validate that it compiles and passes the fuzz repros
- Why does parsing the source code to create call graphs and type indexes improve performance?
  - Agentic models should be able to request this information themselves with tool use, without the need for preprocessing.
    - This paper was put on arXiv on Sep. 18th, so they clearly had access to tool-using agentic models that could easily do this.
  - Also, I don't see ablations, so I don't know if it does.
- How did they choose the target identification metrics?

## Interesting Questions to Explore

- Does generating call graphs improve model performance?
  - What about other forms of codebase processing for context generation?
- What are the most important metrics for deterministic risk identification?
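
As a starting point for the deterministic risk identification question, here is a minimal Python sketch of regex-based scoring over C function bodies. The pattern lists, the weights, and the helper names (`risk_features`, `risk_score`, `rank_functions`) are all assumptions made up for illustration; none of them come from the paper. The branch count is only a crude stand-in for cyclomatic complexity, and pointer arithmetic is left out because it is harder to match without type information.

```python
import re

# Hypothetical pattern lists; neither comes from the Orion paper.
SINK_CALLS = re.compile(r"\b(?:memcpy|memmove|strcpy|strcat|sprintf|gets|alloca)\s*\(")
MEMORY_MGMT = re.compile(r"\b(?:malloc|calloc|realloc|free)\s*\(")
BIT_OPS = re.compile(r"<<|>>")
# Rough proxy for decision points, not a real cyclomatic-complexity computation.
BRANCHES = re.compile(r"\b(?:if|for|while|case)\b|&&|\|\||\?")

# Assumed weights, chosen only for illustration.
WEIGHTS = {"sinks": 5.0, "memory": 3.0, "bit_ops": 1.0, "branches": 1.0, "loc": 0.1}


def count(pattern: re.Pattern, text: str) -> int:
    """Number of non-overlapping matches of pattern in text."""
    return sum(1 for _ in pattern.finditer(text))


def risk_features(body: str) -> dict:
    """Count regex-detectable risk signals in one C function body."""
    return {
        "sinks": count(SINK_CALLS, body),
        "memory": count(MEMORY_MGMT, body),
        "bit_ops": count(BIT_OPS, body),
        "branches": count(BRANCHES, body),
        "loc": sum(1 for line in body.splitlines() if line.strip()),
    }


def risk_score(body: str) -> float:
    """Weighted sum of the features above."""
    return sum(WEIGHTS[name] * value for name, value in risk_features(body).items())


def rank_functions(functions: dict) -> list:
    """Return function names ordered from highest to lowest score."""
    return sorted(functions, key=lambda name: risk_score(functions[name]), reverse=True)
```

Calling `rank_functions` on a mapping of function name to body text gives a fuzzing priority order. Comparing that order against an LLM-produced ranking on the same codebase would be one way to test the "just use regex" hunch.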

## Important Takeaways

- They define seeds prior to their harnesses.
  - Their rationale seems to be if they constrain the input and output spaces, LLMs will be better at generating harnesses.
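
One way to read the seeds-first idea is that the seeds pin down a byte layout, and the harness then only has to decode that layout. Below is a minimal Python sketch of that reading. The layout (1-byte kind, 2-byte little-endian length, payload) and the names `make_seed`, `decode_input`, and `harness_accepts_all_seeds` are hypothetical and not taken from the paper. The real harnesses would be compiled fuzz drivers; Python is used here only to illustrate the constraint.

```python
import struct

# Hypothetical input layout for an imaginary target:
# 1-byte kind, 2-byte little-endian payload length, then the payload bytes.
HEADER = struct.Struct("<BH")


def make_seed(kind: int, payload: bytes) -> bytes:
    """Seed side: serialize one input in the agreed layout."""
    return HEADER.pack(kind, len(payload)) + payload


def decode_input(data: bytes):
    """Harness side: return (kind, payload), or None for inputs to skip."""
    if len(data) < HEADER.size:
        return None
    kind, length = HEADER.unpack_from(data)
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        return None  # Truncated input; a real harness would return early here.
    return kind, payload


# Seeds exist first; the harness decoder is then checked against them.
SEEDS = [make_seed(1, b"hello"), make_seed(2, b""), make_seed(255, b"\x00" * 16)]


def harness_accepts_all_seeds(seeds) -> bool:
    """True if every seed decodes, i.e. the harness matches the seed layout."""
    return all(decode_input(seed) is not None for seed in seeds)
```

A check like `harness_accepts_all_seeds(SEEDS)` gives a cheap pass/fail signal on whether a generated harness is compatible with the seeds, which is a narrower target than writing an arbitrary harness from scratch.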