Intent-based Extensible Real-time PHP Supervision Framework

INTRODUCTION AND MOTIVATION

The growing number of web applications makes them an attractive target for attacks due to the large amount of private user data accessed, processed and stored by these applications. Various protection techniques have been developed to secure these popular web applications and to ensure the security and privacy of the data.

Existing web application protection techniques may be categorised in different ways. For example, some protection approaches define specific development practices, such as domain-specific languages or third-party APIs [3][4][5][6]. Such practices are mostly applicable to new software. Other protections focus on legacy web applications [9][10][11]. In practice, web applications commonly evolve over time as user preferences and demands change. Thus, hybrid protection approaches that cover both existing legacy web applications and further application development would be beneficial.

A conceptual difference between protection methods lies in the priorities assigned to the viewpoints and intentions of application developers and protection developers. Some approaches consider the application developer’s intent to a certain degree by analysing the application source code [7][8] or by observing application behaviour at run time. Other approaches, however, may overlook application developers’ intentions; this is common in rule-based protections, where protection developers or administrators decide how the application should behave. Such enforced protection may seem to eliminate the need to trust the web application by explicitly disabling unwanted application functionality, but a malicious web application developer may implement a back door that bypasses the protection if the rule set employed is known.

We propose an extensible real-time supervision framework for PHP web applications, designed for use in an Intrusion Prevention System (IPS).
Other applications may include web application profiling, optimisation and debugging, but here we focus on the supervision processes. The framework supervises the web application execution to ensure that it behaves as intended by the application author, while also allowing enforcement of behaviour determined by the protection administrator.

With enforced-only protections, in-built protection methods may be duplicated or some intended application functionality might become unavailable. For example, a naïve proxy implementation that prevents SQL injection attacks by dropping or sanitising user input would be redundant if the web application already performs the same procedures. Also, some database management web applications (such as phpMyAdmin) intentionally provide a way to submit arbitrary SQL queries, including characters commonly treated as dangerous (e.g. ‘;’). Ignoring the purpose and functionality of such management applications and blindly using a sanitising proxy may prevent correct execution of some legitimate SQL requests. Some protection approaches acknowledge this limitation in servicing potentially dangerous management applications [1][16].

To address the limitations of redundant protections and reduced functionality, the proposed framework focuses on obtaining a deep understanding of the intended web application behaviour through source code analysis. PHP is a highly popular web application development language [12], so the framework focuses on PHP. Some notions, such as the intent graph, may be applied to other web application development languages. The framework is focused on the server side and requires no client assistance, since malicious users can disable client-side code execution. The main contributions of the approach are its ease of deployment and its ability to work without altering existing source code.

The rest of the paper is organised as follows. Section II reviews the related work.
The general architecture of the proposed framework and the individual framework components are discussed in Sections III and IV. Intrusion detection-related aspects and procedures are presented in Sections V and VI. Performance considerations are discussed in Section VII. In Section VIII, we discuss the implications of our findings and outline future research opportunities.
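
The reduced-functionality problem of a naïve sanitising proxy, described above, can be illustrated with a minimal sketch. It is written in Python for brevity and is not part of the proposed framework; the blocklist and the function name `naive_sanitise` are assumptions made purely for illustration.

```python
import re

# Hypothetical blocklist: characters and sequences that a naive proxy
# treats as dangerous in every request, whatever the target application.
DANGEROUS = re.compile(r"[;'\"]|--")


def naive_sanitise(value: str) -> str:
    """Drop every blocklisted character from a request parameter."""
    return DANGEROUS.sub("", value)


# Ordinary form input passes through unchanged.
username = naive_sanitise("alice")

# A legitimate multi-statement request sent to a database management
# application loses its statement separators and no longer runs as intended.
admin_query = "CREATE TABLE t (id INT); INSERT INTO t VALUES (1);"
mangled = naive_sanitise(admin_query)
print(mangled)
```

The proxy has no notion of what the receiving application intends to do with the input, so the same rule that is redundant for an application with its own sanitisation also corrupts the request of a management application.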

RELATED WORK

While rule-based protections may be easier to implement, given the open nature of scripting languages used in web development, securing web applications is commonly based on source code analysis. Static source code analysis allows the detection of vulnerabilities before application deployment and execution; however, detected issues must be addressed before the application can be used in a production environment. Pixy [13] is an example of this type of protection, designed to detect SQLI and XSS vulnerabilities. Reports generated by such tools may be intended for analysis by the application developer, and require the developer to manually fix the detected issues. In some cases, however, automated patching may be performed as well [15]. In general, such tools are not, strictly speaking, protections, but vulnerability scanners. A more detailed review appears in [14].

Taint-tracking analysis is a type of source code analysis commonly used to detect misinterpretation of user input [8]. For example, preventing user input entered on the web page from being mistakenly interpreted and executed as SQL can protect against SQL injection attacks. The popularity of such an approach is attributable to a relatively strict user-application interaction protocol. User input typically arrives at a web application in HTTP GET/POST parameters, headers or cookies, making it relatively easy to track input vectors. However, focusing only on the protocol-defined sources makes higher-order XSS attacks possible, as previously stored user input may be retrieved later from other sources.

Dynamic analysis of various application aspects is also used to detect some types of vulnerabilities, such as SQL injections [1][18]. In contrast to the static analysis-based protection approaches common in vulnerability scanning, dynamic approaches have more control over application execution, can prevent actual attacks, and are typically used in attack detection and prevention.
However, due to their higher requirements, dynamic approaches tend to focus on a subset of possible application actions. For example, BAXTEP allows shell command executions to be prevented [17]; while useful, this covers only a narrow aspect of application behaviour and may not be enough to provide comprehensive protection, as large portions of application activity remain ignored.

Due to the nature of source code analysis, web application protections are generally language-dependent. Moreover, some of the protections depend on the language version. For example, Pixy is not actively maintained and is compatible only with an obsolete version of PHP (PHP4). In addition, object-oriented programming is not supported by Pixy.

Another area of interest, in addition to database-related activity and user input tracking, is the structure of the generated web page. One approach [2] aims to approximate the expected web page as a grammar. While not a protection in itself, the predicted page structure may be used to verify whether the generated page is as expected; a mismatch may indicate an XSS attack. The inherent disadvantage of such page structure approximation lies in the low precision achieved in some highly dynamic cases. For example, if the output page structure can be heavily affected by the application users, large parts of the page may become unpredictable. While, in general, such web pages may be vulnerable to XSS attacks and typically need to be reworked, in some cases this may be intended behaviour. In such scenarios, a special form of code annotation may be needed to mark the intended, potentially dangerous code [19].

Given the limitations of, and the diverse techniques used by, the reviewed protection approaches, a unifying protection framework is proposed in this paper. Its purpose is to minimise the gap between general policy- and security-based approaches and to assist web application developers.
The approach also detects control-flow integrity violations; such detection is commonly applied in lower-level environments [20][21] but has not previously been used in the web application context.

Despite the range of web application protection techniques and the increased understanding of secure web development, successful large-scale attacks against web applications still occur [23][24], and high-profile attacks are even discussed in popular media [22][25]. Therefore, a universal framework that is extensible to suit various environments would be beneficial.
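
The taint-tracking idea reviewed in this section, together with its blind spot for previously stored input, can be sketched as follows. This is a minimal illustration in Python, not the mechanism of any cited tool; the `Tainted` marker type and all function names are hypothetical, and the sanitiser is deliberately simplistic.

```python
class Tainted(str):
    """Marker type (hypothetical) for strings that originate from user input."""


def from_request(params: dict, name: str) -> Tainted:
    """Read an HTTP parameter; everything from the request starts out tainted."""
    return Tainted(params.get(name, ""))


def concat(*parts: str) -> str:
    """Join strings, propagating taint if any part is tainted."""
    joined = "".join(parts)
    return Tainted(joined) if any(isinstance(p, Tainted) for p in parts) else joined


def escape_sql(value: str) -> str:
    """Toy sanitiser: doubles single quotes and returns a plain, untainted str."""
    return str(value).replace("'", "''")


def run_query(sql: str) -> str:
    """SQL sink: refuse any query that still carries taint."""
    if isinstance(sql, Tainted):
        raise ValueError("tainted data reached the SQL sink")
    return sql  # a real sink would pass the query to the database here


def store_and_reload(value: str) -> str:
    """Simulate a database round trip, which drops the in-memory taint marker."""
    database = {"comment": str(value)}
    return database["comment"]


name = from_request({"name": "x' OR '1'='1"}, "name")
prefix = "SELECT * FROM users WHERE name = '"

try:
    run_query(concat(prefix, name, "'"))
except ValueError as err:
    print("blocked:", err)

print(run_query(concat(prefix, escape_sql(name), "'")))

# Input that was stored earlier comes back without its marker, so a tracker
# that only watches protocol-defined sources no longer recognises it.
reloaded = store_and_reload(name)
print(isinstance(reloaded, Tainted))
```

The last step shows why tracking only GET/POST parameters, headers and cookies is insufficient: once input has been persisted and read back, it must be treated as a source in its own right for higher-order attacks to be detected.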