
Git

13 Jan 2021

Categories: Data-science


What is Git?

According to the official documentation, Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.

When we work on a project, we often need version control to keep track of changes over time. Git is a tool that provides it.


How does Git work?

First, Git handles content in snapshots, one for each commit, and knows how to apply or roll back the change sets between two snapshots. This is an important concept. Understanding the concept of applying and rolling back change sets makes Git much easier to understand and work with.

Process of creating a change from branch A to branch B
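The idea of snapshots and the change sets between them can be tried out in a throwaway repository. The following is a minimal sketch meant to be run as a script; the temporary directory, the file name notes.txt, the placeholder identity and the commit messages are all assumptions made for illustration.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master, whatever the local default is
git config user.name "Example"             # placeholder identity for the commits
git config user.email "example@example.com"

# First snapshot
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Second snapshot
echo "second line" >> notes.txt
git commit -q -am "Extend notes"

# The change set between the two snapshots
git diff HEAD~1 HEAD

# The file as it was stored in each snapshot
git show HEAD~1:notes.txt
git show HEAD:notes.txt
```

Here HEAD~1 refers to the snapshot just before the current one, so the diff is exactly the change set that the second commit applied.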

Naming

Snapshots (changes) are the main elements in Git. Each one is named by its commit ID, a hash such as “6e3cd7b0261a299d84e867d7bd765d6df3435ba7”. The hash is computed from the actual content and some metadata such as the time of submission, author information, and parents. Git can also abbreviate the ID to a short prefix that is still unique within the repository; “6e3cd7b” is the abbreviation in this example.

A branch in Git is only a named pointer to a specific snapshot. It marks the place where new changes are applied while the branch is in use.
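To see these names in practice, the sketch below prints the full and abbreviated ID of a commit, the metadata stored with it, and the commit a branch name resolves to. It assumes a throwaway repository; the file name, identity and commit message are placeholders.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master, whatever the local default is
git config user.name "Example"             # placeholder identity for the commits
git config user.email "example@example.com"

echo "hello" > file.txt
git add file.txt
git commit -q -m "Initial commit"

full=$(git rev-parse HEAD)                 # full hash ID of the current commit
short=$(git rev-parse --short HEAD)        # abbreviated prefix of the same ID
echo "$full"
echo "$short"

# Metadata stored in the commit object: tree, author, committer, message
git cat-file -p HEAD

# A branch name is just a pointer, so it resolves to a commit ID
git rev-parse master
```

The exact hash values differ on every run because the commit time is part of the hashed metadata.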

Branching

The concept behind branching is that each snapshot can have more than one child. Applying a second change set to the same snapshot creates a new, separate stream of development.

Example branch structure in Git

As shown in the figure above, the repo has three branches: master, old and feature. Each branch points to a different snapshot. HEAD is the pointer to the branch that is currently checked out.
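A small sketch of branching, assuming a throwaway repository with two branches (master and feature) rather than the three in the figure; file names and commit messages are made up for illustration.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master, whatever the local default is
git config user.name "Example"             # placeholder identity for the commits
git config user.email "example@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"
base=$(git rev-parse HEAD)

# A new branch is a second pointer to the same snapshot
git branch feature

# First child of the base snapshot, on feature
git checkout -q feature
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Feature work"

# Second child of the same snapshot, on master
git checkout -q master
echo "master work" > master.txt
git add master.txt
git commit -q -m "Master work"

git log --oneline --graph --all            # draws the two streams of development
git symbolic-ref --short HEAD              # the branch HEAD currently points to
```

Both branches now have the base commit as their common ancestor, which is what makes them separate streams of development.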

Merging

When development of a new feature is finished, it needs to be merged back into the master branch. To do so, check out the master branch and run git merge <branch name>. Git merges the changes from the given branch into the checked-out branch by applying all of the change sets from the feature branch onto the tip of the master branch.

Depending on the type of changes in the two branches, and on possible conflicts, there are three possible outcomes.

Fast forward Merge

The receiving branch, master (checked out as the first step), has not received any changes since the two branches diverged. It still points to the last commit before the other branch, feature, started. In this case, Git moves the branch pointer of the receiving branch, master, forward to the last snapshot in the other branch, feature. Because there is nothing to do besides moving the branch pointer forward, Git calls this a fast-forward merge.

Fast-forward merge
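A fast-forward merge can be reproduced with the sketch below. It assumes a throwaway repository; the --ff-only option is added here only to make Git refuse anything other than a fast-forward, and the file names and messages are placeholders.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master, whatever the local default is
git config user.name "Example"             # placeholder identity for the commits
git config user.email "example@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

# Only feature receives new commits; master stays where it is
git checkout -q -b feature
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Feature work"

# Check out the receiving branch, then merge
git checkout -q master
git merge -q --ff-only feature

git log --oneline --graph --all            # a single straight line of commits
```

After the merge, master and feature point to the same commit and no new commit has been created.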

No-conflict Merge

There are changes in both branches, master and feature, but they do not conflict, for example because the changes in the two branches affect different files. Git can automatically apply all changes from the other branch to the receiving branch, master, and create a new commit that includes them. The receiving branch is then moved forward to that commit.

No-conflict merge
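The sketch below sets up two branches that change different files and merges them. It assumes a throwaway repository; --no-edit is used only so that Git accepts the default merge message without opening an editor, and all names are placeholders.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master, whatever the local default is
git config user.name "Example"             # placeholder identity for the commits
git config user.email "example@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

# feature changes one file
git checkout -q -b feature
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Feature work"

# master changes a different file
git checkout -q master
echo "master work" > master.txt
git add master.txt
git commit -q -m "Master work"

# Git combines both change sets in a new merge commit
git merge -q --no-edit feature

git log --oneline --graph                  # the graph shows the two lines joining
git rev-list --parents -n 1 HEAD           # merge commit ID followed by its two parent IDs
```

The new commit on master has two parents: the previous tip of master and the tip of feature.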

Conflicting Merge

There are changes in both branches, but they conflict. In this case, the conflicting result is left in the working directory for the user to fix and commit, or the merge can be aborted with the git merge --abort command.

To see how to resolve conflicts in practice, check Solving Conflict
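A conflict is easy to provoke by changing the same line on both branches. The sketch below does that in a throwaway repository, inspects the result and then aborts the merge; the file name and contents are assumptions made for illustration.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master, whatever the local default is
git config user.name "Example"             # placeholder identity for the commits
git config user.email "example@example.com"

echo "original" > file.txt
git add file.txt
git commit -q -m "Base commit"

# Both branches change the same line of the same file
git checkout -q -b feature
echo "feature version" > file.txt
git commit -q -am "Change on feature"

git checkout -q master
echo "master version" > file.txt
git commit -q -am "Change on master"

# The merge stops with a conflict; "|| true" keeps the script running under set -e
git merge feature || true

git status --short                                     # file.txt is listed as unmerged
conflicts=$(git diff --name-only --diff-filter=U)      # names of the conflicting files
cat file.txt                                           # both versions, separated by conflict markers

# Give up on the merge and restore the state from before it started
git merge --abort
```

Instead of aborting, the file could be edited to the desired content, staged with git add and committed to complete the merge.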


Merging is not always what we want. Sometimes we are still developing a feature while development on master continues in parallel, and we do not want to merge the unfinished feature into the master branch just yet. As a consequence, the two branches, master and feature, move away from each other quite quickly. However, it is possible to apply change sets from one branch to another. Git offers the rebase and cherry-pick features for that.

Rebasing

While developing a feature, we often need to incorporate the latest changes from the master branch to keep up with general development. This is called rebasing the feature branch. It moves the point where the two branches diverge further up one of the branches. Git replays the commits of one branch, starting with the oldest, on top of the tip of the other branch, creating a new commit for each of the original commits.

Rebasing a branch
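The following sketch rebases a feature branch onto a master branch that has moved on in the meantime. It assumes a throwaway repository with one commit on each side; file names and messages are placeholders.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master, whatever the local default is
git config user.name "Example"             # placeholder identity for the commits
git config user.email "example@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

git checkout -q -b feature
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Feature work"

# master moves on in parallel
git checkout -q master
echo "master work" > master.txt
git add master.txt
git commit -q -m "Master work"

# Replay the feature commits on top of the current tip of master
git checkout -q feature
old=$(git rev-parse HEAD)                  # ID of the feature commit before the rebase
git rebase -q master

git log --oneline --graph --all            # feature now sits directly on top of master
```

The feature commit gets a new ID because its parent has changed, which is why the rebased commits count as new commits.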

Cherry picking

Sometimes, while working on a feature, we develop a change that should go into master immediately. This could be a bug fix or a cool feature, but we do not want to merge or rebase the branches yet. Git lets us copy a single change set from one branch to another with the cherry-pick feature.

Cherry picking a commit
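As an illustration, the sketch below copies one bug-fix commit from feature to master and leaves the unfinished work behind. It assumes a throwaway repository; the file names fix.txt and wip.txt and the commit messages are made up.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master, whatever the local default is
git config user.name "Example"             # placeholder identity for the commits
git config user.email "example@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

git checkout -q -b feature
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Urgent bug fix"
fix=$(git rev-parse HEAD)                  # remember the commit to copy

echo "unfinished" > wip.txt
git add wip.txt
git commit -q -m "Work in progress"

# Copy only the bug fix onto master
git checkout -q master
git cherry-pick "$fix"

git log --oneline --graph --all
```

master now contains fix.txt but not wip.txt, and the feature branch is unchanged.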

Revert

The revert command rolls back one or more change sets in the working directory, then creates a new commit with the result. A revert is almost the reverse of a cherry-pick.

Reverting a commit
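A minimal sketch of a revert, assuming a throwaway repository with a single file; --no-edit is used only so that Git accepts the default commit message without opening an editor.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master, whatever the local default is
git config user.name "Example"             # placeholder identity for the commits
git config user.email "example@example.com"

echo "v1" > file.txt
git add file.txt
git commit -q -m "Add file"

echo "v2" > file.txt
git commit -q -am "Change file"

# Roll back the last change set and record the result as a new commit
git revert --no-edit HEAD

cat file.txt                               # prints v1 again
git log --oneline                          # three commits: add, change, revert
```

The history keeps the original change and adds a commit that undoes it, so nothing that was already shared is rewritten.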


Workflow

Here is a list of common workflows that we use as developers:

To see the workflows in action, check Common workflow


Reference:

git
Learn the workings of Git
