Path Sensitive Static Analysis of Taint-Style Vulnerabilities in PHP Code

INTRODUCTION Web applications are now used for many daily activities, such as banking and retailing. They process and store a large volume of valuable data, which makes them attractive targets for attackers. An attacker can compromise a web application if it contains vulnerabilities. According to the Risk Based Security report [1], 2016 set a new all-time high, with VulnDB recording 15,000 vulnerabilities, of which web application vulnerabilities accounted for 53.5%. To reduce the number of web application vulnerabilities, an application can be audited before it is deployed on a web server. PHP is the most popular scripting language on the web, and PHP code is prone to many types of critical security vulnerabilities, such as XSS [2, 3], SQL injection [4, 5], and CSRF [6]. Because there is no general model covering all types of web application vulnerabilities, there is no general algorithm for finding them all. Taint-style vulnerabilities are a special class of web application vulnerabilities [7, 8]. In a vulnerability of this style, tainted data from a malicious user causes a security problem at a vulnerable point in the program. XSS and SQL injection are typical examples of taint-style vulnerabilities. Livshits et al. [7] modeled taint-style vulnerabilities as a triple <sources, sinks, derivations>: the sources specify the ways a user can provide data to the program, the sinks specify the unsafe ways in which the program uses that data, and the derivations specify how data propagates between objects in the program. Taint analysis is a major method for detecting taint-style vulnerabilities. For PHP code, static taint analysis was first explored in WebSSARI [9] to find taint-style vulnerabilities. Since then, many other tools have been implemented [8, 10-12] and used to find many web application vulnerabilities in PHP scripts.
Every vulnerability reported by an analysis tool must be confirmed with a proof of concept (POC) constructed by the user. Path information about the vulnerability is very important for constructing a POC effectively. However, the tools listed above do not provide enough path information, because the taint analysis algorithms they implement are not path-sensitive. We present a novel path-sensitive, interprocedural, and context-sensitive static data flow analysis method for taint-style vulnerabilities in PHP code, and implement it in a tool named POSE.
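
To make the triple model of Livshits et al. concrete, the following is a minimal Python sketch, not taken from POSE or from any of the cited tools. It assumes a program reduced to straight-line statements of the form target = func(args). The class TaintModel, the function find_tainted_sinks, the statement encoding, and the sanitizer set are all hypothetical names introduced for illustration; sanitizers in particular are an added assumption, since the triple itself only names sources, sinks, and derivations. Derivations are modeled in the simplest possible way: a result is tainted if any of its arguments is tainted.

```python
from dataclasses import dataclass, field


@dataclass
class TaintModel:
    """Illustrative rule set; the entries used below are examples only."""
    sources: set = field(default_factory=set)     # where user data enters
    sinks: set = field(default_factory=set)       # unsafe uses of data
    sanitizers: set = field(default_factory=set)  # assumption: calls that clear taint


def find_tainted_sinks(statements, model):
    """Propagate taint forward over straight-line statements.

    Each statement is (target, func, args), meaning target = func(*args).
    A sink call has target None. Returns a list of
    (statement_index, sink_name, tainted_argument).
    """
    tainted = set()
    findings = []
    for index, (target, func, args) in enumerate(statements):
        if func in model.sinks:
            # A sink is vulnerable if it receives any tainted argument.
            for arg in args:
                if arg in tainted:
                    findings.append((index, func, arg))
            continue
        if func in model.sources:
            is_tainted = True
        elif func in model.sanitizers:
            is_tainted = False
        else:
            # Derivation rule: taint flows from arguments to the result.
            is_tainted = any(arg in tainted for arg in args)
        if target is not None:
            if is_tainted:
                tainted.add(target)
            else:
                tainted.discard(target)
    return findings


# Hypothetical encoding of a small PHP fragment:
#   $id = $_GET['id']; $safe = intval($id);
#   $q1 = "..." . $id; $q2 = "..." . $safe;
#   mysql_query($q1); mysql_query($q2);
statements = [
    ("id", "$_GET", []),
    ("safe", "intval", ["id"]),
    ("q1", "concat", ["id"]),
    ("q2", "concat", ["safe"]),
    (None, "mysql_query", ["q1"]),
    (None, "mysql_query", ["q2"]),
]
model = TaintModel(
    sources={"$_GET"}, sinks={"mysql_query"}, sanitizers={"intval"}
)
findings = find_tainted_sinks(statements, model)
print(findings)
```

Only the first query is reported, because the second one is built from the sanitized value. This forward pass reports that a sink is reachable by tainted data but records nothing about which branches were taken, which is the gap that a path-sensitive analysis is meant to close.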

THE DETECTING TOOL–POSE We implement the path-sensitive static analysis method in a tool named POSE (PHP cOde Static rEviewer). The framework of POSE is shown in Fig. 1. The code preparation module is built on PHP Parser [13]; it transforms the PHP code into an abstract syntax tree (AST) and then constructs the control flow graph (CFG) and call graph (CG) of the code. The sink searching module collects all sinks in the PHP code according to the CG and the sink definitions, which are stored in a file. The path searching module implements the path searching algorithm and finds all paths of a context. To prevent the path search from looping endlessly, all cycles in CFG paths and in function call relations are excluded from the CFG and CG. The taint analysis module implements the static taint analysis algorithm: it traces all variables along all paths from the sink back to the start of the PHP code and judges whether a taint-style vulnerability exists.
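
The path searching and taint analysis steps described above can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the POSE implementation: the CFG is a plain dictionary from block name to successor list, cycles are broken by dropping back edges found during a depth-first search, and each block holds statements of the form (target, func, args). The function names (acyclic_edges, all_paths, path_is_tainted, vulnerable_paths), the data layout, and the treatment of sanitizers are all hypothetical.

```python
def acyclic_edges(cfg, entry):
    """Copy the CFG reachable from entry, dropping edges that close a cycle."""
    result = {}
    on_stack, done = set(), set()

    def visit(node):
        result.setdefault(node, [])
        on_stack.add(node)
        for succ in cfg.get(node, []):
            if succ in on_stack:
                continue  # back edge: following it would loop forever
            result[node].append(succ)
            if succ not in done:
                visit(succ)
        on_stack.discard(node)
        done.add(node)

    visit(entry)
    return result


def all_paths(dag, start, goal):
    """Enumerate every path from start to goal in an acyclic graph."""
    if start == goal:
        return [[goal]]
    paths = []
    for succ in dag.get(start, []):
        for rest in all_paths(dag, succ, goal):
            paths.append([start] + rest)
    return paths


def path_is_tainted(path, blocks, sink_var, sources, sanitizers):
    """Trace sink_var backward along one path, from the sink to the entry."""
    wanted = {sink_var}  # variables whose origin still has to be explained
    for block in reversed(path):
        for target, func, args in reversed(blocks.get(block, [])):
            if target not in wanted:
                continue
            if func in sources:
                return True  # the value reaching the sink is user input
            wanted.discard(target)
            if func not in sanitizers:
                wanted.update(args)  # keep tracing the inputs of this value
    return False


def vulnerable_paths(cfg, blocks, entry, sink_block, sink_var,
                     sources, sanitizers):
    dag = acyclic_edges(cfg, entry)
    return [p for p in all_paths(dag, entry, sink_block)
            if path_is_tainted(p, blocks, sink_var, sources, sanitizers)]


# Hypothetical CFG: a branch that sanitizes $id on one side only,
# plus a loop edge from the sink block back to the entry.
cfg = {
    "entry": ["then", "else"],
    "then": ["join"],
    "else": ["join"],
    "join": ["sink"],
    "sink": ["entry"],
}
blocks = {
    "entry": [("id", "$_GET", [])],
    "then": [("id", "intval", ["id"])],
    "sink": [("q", "concat", ["id"])],  # $q is then passed to the sink call
}
result = vulnerable_paths(cfg, blocks, "entry", "sink", "q",
                          sources={"$_GET"}, sanitizers={"intval"})
print(result)
```

In this example only the path through the else block is reported, which is exactly the branch information a user would need when constructing a POC. Enumerating every path can grow exponentially with the number of branches, so a practical tool would need some way to bound or prune the search; the sketch ignores that concern.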

CONCLUSIONS Detecting web application vulnerabilities has been a research hotspot in recent years. Static taint analysis is a critical method for detecting such vulnerabilities. Because static analysis reports vulnerabilities with a high false positive rate, each reported vulnerability needs to be confirmed with a POC constructed by the user. Path information about the vulnerability is very important for constructing a POC effectively. However, the tools that have implemented static analysis methods so far lack enough path information, because their taint analysis algorithms are not path-sensitive. The novel path-sensitive static analysis method proposed in this paper consists of three key steps: path searching within a basic block, path searching between blocks, and path searching across function calls. A tool named POSE implements the new method, and the test results show that the method is valid for taint-style vulnerabilities in PHP code. Static methods still produce many false positives when detecting web application vulnerabilities, and AI-based solutions may be proposed to address this problem.