Recent Advances in Computer Science and Communications

Author(s): Alok Aggarwal*, Vinay Singh and Narendra Kumar

DOI: 10.2174/2666255814666210621121914

A Rapid Transition from Subversion to Git: Time, Space, Branching, Merging, Offline Commits & Offline builds and Repository Aspects

Article ID: e060422194190 Pages: 8


Abstract

Background: Software development is undergoing a transition from centralized to decentralized version control systems. This transition is driven by the limitations of centralized version control systems in terms of branching, merging, time, space, offline commits and builds, and repository aspects. The transition from Subversion, a centralized version control system, to Git, a decentralized version control system, has so far received only limited attention.

Objective: This work investigates the transition from the Subversion version control system (VCS) to the Git VCS in terms of time, space, branching, merging, and repository aspects, from the point of view of software developers working individually or in a large team on large, complex software with a legacy of many decades. Experimentation was conducted in SRLC Software Research Lab, Chicago, USA.

Methods: Various scripts were developed for version control using Git and executed on a few legacy software systems.
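
The abstract does not reproduce the authors' scripts. As a rough illustration of the kind of measurement described, the following Python sketch times local branch creation in Git and records a commit in a repository that has no remote at all. The function names, file name, and commit messages are hypothetical, the sketch assumes a `git` executable on the PATH, and it makes no attempt to reproduce the Subversion side of the comparison or the reported figures.

```python
import shutil
import subprocess
import tempfile
import time
from pathlib import Path


def run_git(repo: Path, *args: str) -> str:
    """Run one git command inside `repo` and return its standard output."""
    command = [
        "git",
        "-c", "user.name=Example Developer",      # hypothetical identity
        "-c", "user.email=dev@example.invalid",   # hypothetical identity
        "-c", "commit.gpgsign=false",
        *args,
    ]
    result = subprocess.run(
        command, cwd=repo, check=True, capture_output=True, text=True
    )
    return result.stdout


def time_local_branch_and_commit() -> dict:
    """Time branch creation and make an offline commit in a throwaway repository."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        run_git(repo, "init", "--quiet")
        (repo / "main.txt").write_text("first version\n")
        run_git(repo, "add", "main.txt")
        run_git(repo, "commit", "--quiet", "-m", "Initial commit")

        # Branch creation is a purely local operation in Git.
        start = time.perf_counter()
        run_git(repo, "branch", "feature")
        branch_seconds = time.perf_counter() - start

        # No remote is configured, so this commit is recorded entirely offline.
        (repo / "main.txt").write_text("second version\n")
        run_git(repo, "commit", "--quiet", "-am", "Offline change")

        return {
            "branch_seconds": branch_seconds,
            "commit_count": int(run_git(repo, "rev-list", "--count", "HEAD")),
            "remotes": run_git(repo, "remote").split(),
            "branches": run_git(repo, "branch", "--format=%(refname:short)").split(),
        }


if __name__ == "__main__":
    if shutil.which("git") is None:
        print("git not found on PATH; skipping the measurement")
    else:
        print(time_local_branch_and_commit())
```

A single timing like this is noisy; a fair comparison would repeat the operation many times on a realistic repository and run the equivalent Subversion commands against a server.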

Results: Branching in Git is about 39 times faster than in Subversion. Merging in Git is trivial and automatic, while Subversion needs a manual merging process that is error-prone. Using Mozilla with the fsfs backend as an example, it is observed that Git can use up to 30 times less disk space than Subversion. For a typical large project, Git requires almost half as many revisions as Subversion. Furthermore, with the fsfs backend, a project with ten years of history and 240,000 commits needs 240 directories in Subversion, while Git requires only 2 directories. With Git's offline commits and offline builds, whitespace changes can be staged in a commit of their own, separate from significant business logic changes. This is not possible in Subversion, which requires a complicated system of diffing to temporary files. It is also observed that Git provides an offline commit facility: if the remote repository is unavailable because of a disaster or network failure, developers can still commit their code offline and execute an offline build.
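
The figure of 240 directories is consistent with a sharded fsfs layout in which each directory holds the revision files for a fixed number of revisions. The short sketch below works through that arithmetic. The shard size of 1,000 revisions per directory is an assumption, since the abstract does not state it, and the function name is hypothetical.

```python
def fsfs_shard_directories(revisions: int, revisions_per_shard: int = 1000) -> int:
    """Return how many shard directories are needed to hold `revisions` revisions.

    `revisions_per_shard` is an assumed shard size, not a value from the abstract.
    """
    if revisions < 0 or revisions_per_shard <= 0:
        raise ValueError("revisions must be >= 0 and revisions_per_shard must be > 0")
    # Ceiling division: a partially filled shard still needs its own directory.
    return -(-revisions // revisions_per_shard)


if __name__ == "__main__":
    print(fsfs_shard_directories(240_000))
```

Under this assumption the directory count grows linearly with the number of commits, whereas the abstract reports a fixed count of 2 directories for Git.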

Conclusion: No previous study was found that focused on how the choice between Git and Subversion actually affects software developers, which forms the motivation for the present work. This work compiles a list of the ways in which that choice affects software developers. Although software developers are unaffected by the choice in many respects, a few interesting findings emerged. One of the most interesting is that software developers seem to publish their code to the main repository more often in Git than in Subversion. It is also found that the majority of software developers perform at least two commits per push, which means that Git repositories will contain many more saved points in history than Subversion repositories.

Keywords: Version control system, distributed VCS, centralized VCS, transition, branching, merging, time, space.

