An Effective Regression Testing Approach for PHP Web Applications

INTRODUCTION

Web applications change and are upgraded frequently in response to security attacks, feature updates, or changes in user preferences. These fixes often involve small patches or revisions, but developers and testers still need to perform regression testing on their products to detect whether the changes have introduced new faults. Applying regression testing to the entire product, however, can require a lot of time and resources [1]. For these applications, a short turnaround time in releasing patches is critical: the applications have already been deployed and are used in the field, so users could suffer a great deal of inconvenience when the services on which they rely are unavailable. For instance, one study [2] reports that small and medium-sized organizations experienced an average loss of $70,000 per downtime hour. Further, organizations that provide those applications could lose customers or suffer damaged reputations if they do not supply patch releases in time. One solution to this problem is to focus regression testing only on the areas of code that have been changed. In this way, companies can deliver quick patches more dependably whenever they encounter security breaches.

To date, the majority of regression testing approaches have focused on utilizing existing test cases (e.g., [3], [4], [5]), but to test updated features or new functionalities thoroughly, we need new test cases that cover areas that existing test cases cannot. Creating new, executable test cases for affected areas of the code, known as the test suite augmentation problem [6], is one of the important regression testing problems, but only a few researchers have started working on it [6], [7], [8], [9]. While this work has made some progress, it has only provided guidance for creating new tests [7], generated new test cases limited to numeric values [9], and considered only small desktop applications.
However, web applications involve different challenges than desktop applications written in C or Java [10], and the majority of web applications deal heavily with strings in addition to numeric values. In addition, while Taneja et al. [8] propose an efficient test generation technique that prunes unnecessary paths, the dynamic symbolic execution-based test generation approach used by other researchers can still be expensive and infeasible when applied to large applications.

To address these limitations, we propose a new test case generation approach that creates executable test cases using program slices and considers both string and numeric input values. In particular, we focus on web applications written in PHP, which is widely used to implement web applications [10]. Our approach requires two steps: (1) identifying the areas of the application impacted by the patches and (2) generating new test cases for the impacted areas of code by using program slices and considering both string and numeric input values. Program slices have been used for many purposes, such as debugging [11], test suite reduction [12], test selection [13], test case prioritization [14], and fault localization [15], but to our knowledge, no attempt has been made to generate test cases using slices, except for the work of Samuel and Mall [16], which uses UML diagrams rather than source code. In this work, we utilize slices to generate test cases.

To facilitate our approach, we implemented a PHP Analysis and Regression Testing Engine (PARTE [17]). To assess our approach, we designed and performed a controlled experiment using open source web applications. Our results showed that our approach is effective in reducing the cost of regression testing for frequently patched web applications.

In the next section of this paper, we describe our overall methodology and the technical details about PARTE. Section III presents our experiment design, results, and analysis. Section V discusses our results. Section VI describes related work relevant to web applications and regression testing, and Section VII presents conclusions and future work.
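
To make the two steps named in this introduction concrete (identify the code impacted by a patch, then use a program slice to narrow down what a new test must exercise), the following is a minimal illustrative sketch. It is not PARTE's implementation. It is written in Python for brevity, the PHP fragments are toy strings, the function names `changed_lines` and `backward_slice` are hypothetical, and the dependency map `DEPS` is written by hand here, whereas a real tool would derive it from the source code.

```python
import difflib


def changed_lines(old, new):
    """Return the 1-based line numbers in `new` that were added or modified."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    changed = set()
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "insert"):
            changed.update(range(j1 + 1, j2 + 1))
    return changed


def backward_slice(deps, criteria):
    """Collect the criteria lines plus every line they transitively depend on."""
    seen = set()
    stack = list(criteria)
    while stack:
        line = stack.pop()
        if line in seen:
            continue
        seen.add(line)
        stack.extend(deps.get(line, ()))
    return seen


# Toy PHP program before and after a one-line patch (assumed example).
OLD = [
    "$name = $_GET['name'];",
    "$age = $_GET['age'];",
    "$greeting = 'Hi ' . $name;",
    "if ($age > 18) { $msg = 'adult'; }",
    "echo $greeting;",
]
NEW = [
    "$name = $_GET['name'];",
    "$age = $_GET['age'];",
    "$greeting = 'Hi ' . $name;",
    "if ($age >= 18) { $msg = 'adult'; }",
    "echo $greeting;",
]

# Hand-written data dependencies for NEW: line -> lines whose values it reads.
DEPS = {3: [1], 4: [2], 5: [3]}

impacted = changed_lines(OLD, NEW)            # step 1: lines touched by the patch
slice_lines = backward_slice(DEPS, impacted)  # step 2: what those lines depend on
print(sorted(slice_lines))                    # prints [2, 4]
```

In this toy case the slice contains only the patched comparison and the line that reads the `age` input, so a new test would need to vary `age` alone; the string input `name` lies outside the slice and can be left at any fixed value.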