Tasks
TASK 1 Targeted Vulnerability Detection: (medium)
Initial Setup: Single Solidity File
Initial Observation: What the contract is about and what it is supposed to do
Objective: Name the vulnerable function and its issue in 2-5 words, or answer NO if no vulnerability exists
The data file contains 7-8 vulnerabilities, so choose a random function every time the environment is reset
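The random choice on reset could look roughly like the sketch below. It assumes the data file is a JSON list with one entry per vulnerability; the names `load_vulnerabilities` and `pick_target` and the `"function"` key are hypothetical and not taken from the actual environment.

```python
import json
import random


def load_vulnerabilities(path):
    """Load the data file; assumes it holds a JSON list of entries."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def pick_target(vulnerabilities, rng=None):
    """Pick one entry uniformly at random; call this on every reset."""
    if rng is None:
        rng = random.Random()
    return rng.choice(vulnerabilities)


# Placeholder entries for illustration only; the real schema may differ.
pool = [
    {"function": "function_a"},
    {"function": "function_b"},
    {"function": "function_c"},
]
target = pick_target(pool, random.Random(0))  # seeded for reproducibility
```

Passing a seeded `random.Random` keeps episodes reproducible during testing, while leaving `rng` unset gives a fresh choice on each reset.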
Actions:
list_functions, -0.05
get code of a function, -0.10 if wrong else +0.05
get function summary of a function (comments), -0.05 if wrong else +0.03
Correct submission, +5
Wrong submission, -1.50
Repeated query, -0.4
Get file metadata (header comment), -0.04
get state variable, -0.05
get call graph, -0.08
Unknown action, negative score
TASK 2 Property Discovery: (hard)
Initial Setup: One Function from a solidity file from a contract with known properties
Initial Observation: What the contract is about and what it is supposed to do
Objective: Generate a natural-language property related to the function
Actions:
get a similar rule/property from a different contract's sol file, -0.2
submit a property, 0 to 5 (reward based on how close it is to the original property); one submission attempt per episode, deterministic checker
get file natspec comments, -0.03
get function natspec comments, -0.08
get code of a function, -0.06
get related functions, -0.06
get input and output of the function, -0.04 (changing it to get signature)
TASK 3 Rule Checker (easy)
Initial Setup: One sol file with at least one function that's breaking the property
Initial Observation: A property given in English (THE MAIN DIFFERENCE)
Objective: Identify rule breaking function
Actions:
Get formalized version of the property, -0.03
Get list of functions, -0.05
get function metadata, -0.05
get code of a function, -0.1
get state variables, -0.05
get call graph, -0.08
submit a function (subfunction of target = 1.5, wrong = -1.5, correct = 5); only one submission per episode
Data From Projects by Certora:
The generated JSON file contains about 7 to 8 vulnerabilities from Certora's audit reports of these 3 projects.
AaveVault
AaveVaultV2
Lido Finance Report
Problems:
Perhaps I need to test to find the right set of rewards for the actions, and also the max steps.
Does it make sense to fetch one global invariant at a time? No, according to the LLM (twice).
In vulnerability detection, the file header comment action doesn't make much sense according to the LLM.
Maybe the call graph action can be revised to be more detailed: instead of getting the entire graph at once, get it in parts.
A vulnerability type doesn't have a particular name, so a function and a description will have to do.
Use the Sentence Transformers library along with keyword matching to estimate the similarity (can't use this).
Maybe increase the size of the dataset.
getfunctioncode can be broken into simpler actions such as get parameters, get what it returns, get state variables, etc.
Get property specification doesn't seem very useful, though.
Can getting severity be a good action? Does it help in identifying the vulnerability? Maybe we can give a strict reference as to what counts as highly severe.
CURRENT:
The agent's output format is specified in the system prompt.
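Since the reward values are still being tuned, one option is to keep the Task 1 schedule in a single table so it can be changed in one place. The sketch below is a minimal Python illustration, not the environment's actual code. Apart from `list_functions`, the action identifiers are hypothetical snake_case names. It also assumes that "wrong" means the queried function is not the vulnerable one, that a repeated query replaces the normal reward, and that an unknown action costs -0.1 (the notes only say "negative score").

```python
# Flat per-action costs for Task 1, as listed in these notes.
FLAT_COSTS = {
    "list_functions": -0.05,
    "get_file_metadata": -0.04,
    "get_state_variable": -0.05,
    "get_call_graph": -0.08,
}

# Rewards that depend on the function named: (if wrong, if target).
TARGET_DEPENDENT = {
    "get_function_code": (-0.10, 0.05),
    "get_function_summary": (-0.05, 0.03),
    "submit": (-1.50, 5.0),
}

REPEATED_QUERY = -0.4
UNKNOWN_ACTION = -0.1  # assumed value; the notes give no number


def step_reward(action, hits_target=False, repeated=False):
    """Return the reward for one step under the assumptions above."""
    if repeated:
        # Assumption: the repeat penalty replaces the usual reward.
        return REPEATED_QUERY
    if action in FLAT_COSTS:
        return FLAT_COSTS[action]
    if action in TARGET_DEPENDENT:
        wrong, right = TARGET_DEPENDENT[action]
        return right if hits_target else wrong
    return UNKNOWN_ACTION
```

For example, `step_reward("submit", hits_target=True)` gives the +5 for a correct submission, and `step_reward("get_call_graph")` gives the flat -0.08 cost.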