

Studying the Impact of Adopting Continuous Integration on the Delivery Time of Pull Requests



PAPER INFORMATION


João Helis Bernardo
IFRN, UFRN, Brazil


Daniel Alencar da Costa
Queen's University, Canada


Uirá Kulesza
UFRN, Brazil



In this work, we analyze 162,653 Pull Requests (PRs) from 87 GitHub projects, implemented in 5 different programming languages, to empirically investigate the impact of switching to Continuous Integration (CI) on the time-to-deliver of merged pull requests.

In particular, we address the following research questions:

RQ1: Are merged pull requests released more quickly using continuous integration?

RQ2: Does the increased development activity after adopting CI increase the delivery time of pull requests?

RQ3: What factors impact the delivery time after adopting continuous integration?

Full Paper Preview and Download

Results


Here we present the results for each RQ that we address.

RQ1: Are merged pull requests released more quickly using continuous integration?

Interestingly, we find that PRs are delivered more quickly after the adoption of CI in only 51.3% of the projects. In addition, we find that in 62 of the 87 studied projects (62/87), the merge time of PRs increases after adopting CI.

RQ1: See Mann-Whitney-Wilcoxon test result and Cliff's delta values
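As a rough illustration of the two statistics named above, the sketch below computes the Mann-Whitney U statistic and Cliff's delta for two small samples in plain Python. The delivery times are invented for illustration and are not taken from our dataset, and the function names are our own. A p-value is not computed here; in practice a statistics library (for example SciPy's `scipy.stats.mannwhitneyu`) would likely be used for that.

```python
from itertools import product


def mann_whitney_u(xs, ys):
    """U statistic for xs: pairs with x > y count 1, ties count 0.5."""
    return sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x, y in product(xs, ys)
    )


def cliffs_delta(xs, ys):
    """Cliff's delta: share of pairs with x > y minus share with x < y."""
    greater = sum(1 for x, y in product(xs, ys) if x > y)
    less = sum(1 for x, y in product(xs, ys) if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Hypothetical delivery times (days) for one project; not real data.
before_ci = [10, 12, 15, 20, 30]
after_ci = [5, 8, 12, 14, 18]

u = mann_whitney_u(before_ci, after_ci)
delta = cliffs_delta(before_ci, after_ci)
print(f"U = {u}, Cliff's delta = {delta:.2f}")
```

The two quantities are linked by delta = 2U / (n * m) - 1, where n and m are the sample sizes, so a positive delta here means the hypothetical pre-CI delivery times tend to be larger.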



RQ2: Does the increased development activity after adopting CI increase the delivery time of pull requests?

We find a considerable increase in the number of PR submissions and in the churn per release after adopting CI. The increased PR submissions and churn are key reasons why projects deliver PRs more slowly after adopting CI. 71.3% of the projects increase their rate of PR submissions after adopting CI.

RQ3: What factors impact the delivery time after adopting continuous integration?

Our models indicate that, before the adoption of CI, the integration load of the development team, i.e., the number of submitted PRs competing to be merged, is the most impactful metric on the delivery time of PRs. On the other hand, our models reveal that, after the adoption of CI, PRs that are merged recently in a release cycle are likely to have a slower delivery time.

RQ3: R² and R² optimism
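This page does not describe how the optimism of R² is estimated. One common approach is the bootstrap: refit the model on resampled data, and average the gap between the R² on the resample and the R² of that same model on the original data. The sketch below illustrates the idea under strong simplifying assumptions: a single-predictor least-squares line, invented data points, and function names of our own choosing. It is not the modelling pipeline used in the paper.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(model, xs, ys):
    """R² of a fitted (intercept, slope) model evaluated on (xs, ys)."""
    intercept, slope = model
    my = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def bootstrap_optimism(xs, ys, n_boot=200, seed=0):
    """Mean of (R² on bootstrap sample) - (R² of that model on original data)."""
    rng = random.Random(seed)
    n = len(xs)
    gaps = []
    while len(gaps) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2 or len(set(by)) < 2:
            continue  # degenerate resample: the line or R² is undefined
        model = fit_line(bx, by)
        gaps.append(r_squared(model, bx, by) - r_squared(model, xs, ys))
    return sum(gaps) / len(gaps)


# Hypothetical (merge_workload, delivery_delay) pairs; not real data.
workload = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
delay = [3, 5, 4, 8, 7, 11, 9, 14, 12, 15]

apparent = r_squared(fit_line(workload, delay), workload, delay)
optimism = bootstrap_optimism(workload, delay)
print(f"apparent R² = {apparent:.3f}, optimism = {optimism:.3f}")
```

Subtracting the optimism from the apparent R² gives an optimism-corrected estimate. A small optimism suggests the model is not overfitting the data it was trained on.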


RQ3: Explanatory Power of the models' variables (Before and After CI)

Datasets


For replication purposes, we make our datasets publicly available to interested readers.


The table below details the meta-data that we fetch for the 162,653 pull requests of our dataset.


| # | Variable | Definition |
|---|----------|------------|
| 1 | project | Full name of the GitHub repository |
| 2 | language | Programming language used by the project |
| 3 | pull_id | Pull request ID |
| 4 | pull_number | Pull request number |
| 5 | commits_per_pr | Number of commits per PR |
| 6 | changed_files | The number of files linked to a pull request submission |
| 7 | churn | The number of added lines plus the number of deleted lines in a pull request |
| 8 | comments | The number of comments of a pull request |
| 9 | comments_interval | The sum of the time intervals (days) between comments divided by the total number of comments of a pull request |
| 10 | merge_workload | The number of PRs that were created and are still waiting to be merged by a core integrator at the moment at which a specific pull request is submitted |
| 11 | description_length | The number of characters in the body (description) of the PR |
| 12 | contributor_experience | The number of previously released pull requests that were submitted by the contributor of a particular PR. We consider the author of the pull request to be its contributor |
| 13 | queue_rank | The number that represents the moment when a pull request is merged compared to other merged PRs in the release cycle. For example, in a queue that contains 100 PRs, the first merged PR has position 1, while the last merged pull request has position 100 |
| 14 | contributor_integration | The average in days of the previously released PRs that were submitted by a particular contributor |
| 15 | stacktrace_attached | Whether the pull request has a stack trace attached in its description |
| 16 | activities | The number of activities of a pull request, where an activity is an entry in the pull request's history |
| 17 | merge_time | Number of days between the submission and the merge of a pull request |
| 18 | delivery_delay | Number of days between the merge and the delivery of a pull request |
| 19 | practice | Whether a pull request was submitted before or after the adoption of CI |
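To make the definitions of queue_rank, merge_time and delivery_delay concrete, the sketch below derives them for the PRs of a single release cycle. The input field names (`submitted_at`, `merged_at`), the function names, and the example dates are assumptions made for illustration. Whether the dataset stores whole or fractional days is also not stated here, so the sketch simply returns fractional days.

```python
from datetime import datetime


def days_between(start, end):
    """Elapsed time between two datetimes, in (fractional) days."""
    return (end - start).total_seconds() / 86400


def annotate_release_cycle(prs, released_at):
    """Rank merged PRs by merge order and compute their timing metrics."""
    ordered = sorted(prs, key=lambda pr: pr["merged_at"])
    rows = []
    for rank, pr in enumerate(ordered, start=1):
        rows.append({
            "pull_number": pr["pull_number"],
            "queue_rank": rank,  # 1 = first PR merged in the cycle
            "merge_time": days_between(pr["submitted_at"], pr["merged_at"]),
            "delivery_delay": days_between(pr["merged_at"], released_at),
        })
    return rows


# Hypothetical release cycle with three merged PRs; not real data.
prs = [
    {"pull_number": 101, "submitted_at": datetime(2016, 1, 1), "merged_at": datetime(2016, 1, 3)},
    {"pull_number": 102, "submitted_at": datetime(2016, 1, 2), "merged_at": datetime(2016, 1, 10)},
    {"pull_number": 103, "submitted_at": datetime(2016, 1, 4), "merged_at": datetime(2016, 1, 5)},
]
rows = annotate_release_cycle(prs, released_at=datetime(2016, 1, 15))
for row in rows:
    print(row)
```

In this example the PR merged last (number 102) receives the highest queue_rank and the shortest delivery_delay, because it waits the least time for the release.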

Use the link below to download all the pull requests of our dataset and their respective meta-data as a CSV file.
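Once downloaded, the CSV can be explored with the Python standard library. The sketch below groups PRs by the practice column and reports the median delivery_delay of each group. It reads from a small in-memory sample so that it runs on its own. The sample rows and the `before`/`after` labels are assumptions; the actual encoding of the practice column should be checked in the downloaded file.

```python
import csv
import io
from collections import defaultdict
from statistics import median


def median_delay_by_practice(csv_file):
    """Median delivery_delay (days) per value of the practice column."""
    delays = defaultdict(list)
    for row in csv.DictReader(csv_file):
        delays[row["practice"]].append(float(row["delivery_delay"]))
    return {practice: median(values) for practice, values in delays.items()}


# Hypothetical sample using three of the columns described in the table.
SAMPLE = """project,pull_number,delivery_delay,practice
owner/repo,1,12.0,before
owner/repo,2,30.5,before
owner/repo,3,8.0,after
owner/repo,4,20.0,after
owner/repo,5,10.0,after
"""

result = median_delay_by_practice(io.StringIO(SAMPLE))
print(result)
```

To run it against the real dataset, pass an opened file instead of the in-memory sample, for example `open(path, newline="")` where `path` points to the downloaded CSV.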



We also make available a second dataset that contains all the releases (7,495) of the 87 studied projects and their respective meta-data.


The table below details all the meta-data that we fetch for the releases.



Use the link below to download all the releases of our dataset and their respective meta-data as a CSV file.