The git version control system#

credit: xkcd

Version control systems manage changes to documents, computer programs, large web sites, and other collections of information.

The main idea of version control systems#

  • A version controlled system (typically) contains one official repository.

  • Contributors work on copies of repository files and upload the changes to the official repository.

  • Conflicts might occur if two people work on the same file simultaneously.

    • Non-conflicting modifications are merged automatically.

    • Conflicting modifications must be resolved manually.

Use cases for version control systems#

Organization

  • Retrieve old versions of files.

  • Print history of changes.

Collaboration

  • Share code between people and work simultaneously on the same codebase

  • Track changes and quickly undo changes if necessary

Backups

  • Store a copy of the git repository on an external platform, e.g. GitHub

Git: the current standard for version control#

  • git is a fast, decentralized, and open-source version control system

  • Many sites for storing git repositories online (e.g. GitHub and Bitbucket).

  • Installation instructions: https://git-scm.com. On Debian derivatives (e.g. Ubuntu):

sudo apt-get install git
  • Recommended book: Pro Git (free to download from https://git-scm.com/book)

(The rest of the lecture uses material from this book)
# (clear everything before we start)
rm -rf ~/in3110/mysrc

Creating your first git repository#

  • A git repository is a folder in which files can be tracked by git.

  • A git repository is created with:

mkdir -p ~/in3110/mysrc
cd ~/in3110/mysrc
git init .  # The mysrc folder is now also a git repository
Initialized empty Git repository in /Users/minrk/in3110/mysrc/.git/
  • Git created a (hidden) directory .git in that folder which will contain all history information.
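
To see what git init leaves behind, here is a minimal sketch. It uses a throwaway directory from mktemp instead of ~/in3110/mysrc (an assumption made here so that the example does not touch your own files).

```shell
# Throwaway repository in a temporary directory (hypothetical location).
repo=$(mktemp -d)
cd "$repo"
git init -q .

ls -a                                 # the hidden .git directory appears in the listing
git rev-parse --git-dir               # prints: .git
git rev-parse --is-inside-work-tree   # prints: true
```

Deleting the .git directory removes all history and turns the folder back into an ordinary directory.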

Adding files to the repository#

  • By default, git does not track any files.

  • Files need to be added to the repository in order to track their changes:

echo "print(\"Hello\")" > myfile.py  # Create a new file myfile.py 
git add myfile.py
ls
myfile.py

Create a snapshot of the repository by committing the added file:

git commit -m 'Initial version of myfile.py'
[main (root-commit) 9e34d87] Initial version of myfile.py
 1 file changed, 1 insertion(+)
 create mode 100644 myfile.py

The lifecycle of the status of your files#

  • Files in your repository can either be tracked or untracked.

  • Untracked files are always left untouched by git.

  • Tracked files can be

    • unmodified (no changes since last commit)

    • modified (changes since last commit)

    • staged (changes are ready to commit)

  • This figure shows the full lifecycle:

  • The git status command shows which state each file is in.
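
The states listed above can be observed one by one with git status --short, which prints a two-letter code per file. The following is a sketch in a throwaway repository; the file name demo.py and the user identity are placeholders chosen for this example.

```shell
# Throwaway repository; demo.py and the identity below are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

echo 'print("Hello")' > demo.py
git status --short        # "?? demo.py"  untracked

git add demo.py
git status --short        # "A  demo.py"  staged (new file)

git commit -q -m "Add demo.py"
git status --short        # no output: tracked and unmodified

echo 'print("World")' >> demo.py
git status --short        # " M demo.py"  modified, not staged

git add demo.py
git status --short        # "M  demo.py"  modified and staged
```

The left column of the code describes the staging area and the right column the working directory, which is why the M moves from right to left after git add.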

Inspecting the changes made since the last commit#

Let’s first make some changes:

echo "print(\"World\")" >> myfile.py 
echo "This is a simple hello world project." > README.md

We use git status to see the current state of the repo:

git status
On branch main
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   myfile.py

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	README.md

no changes added to commit (use "git add" and/or "git commit -a")

Line-by-line changes that have not been staged yet can be displayed with git diff (here nothing is staged, so this is everything since the last commit):

git diff
diff --git a/myfile.py b/myfile.py
index 2f9a147..3d3d93b 100644
--- a/myfile.py
+++ b/myfile.py
@@ -1 +1,2 @@
 print("Hello")
+print("World")
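
One detail worth trying out: git diff compares the working directory with the staging area, so its output becomes empty once a change has been staged. The sketch below shows this in a throwaway repository (demo.py and the identity are placeholders); git diff --staged then compares the staging area with the last commit.

```shell
# Throwaway repository; demo.py and the identity below are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"
echo 'print("Hello")' > demo.py
git add demo.py
git commit -q -m "Add demo.py"

echo 'print("World")' >> demo.py
git diff              # shows +print("World"): working directory vs. staging area

git add demo.py
git diff              # prints nothing: the change is staged now
git diff --staged     # shows +print("World"): staging area vs. last commit
```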

Creating another commit#

Let’s stage all changes with git add:

git add README.md myfile.py
git status
On branch main
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	new file:   README.md
	modified:   myfile.py

If we are satisfied, we create a snapshot of the repo with git commit:

git commit -m 'New README.md file and fix in myfile.py'
[main fe6fe87] New README.md file and fix in myfile.py
 2 files changed, 2 insertions(+)
 create mode 100644 README.md

Viewing the history of commits#

  • For every commit, git creates a snapshot of all tracked files in the repository.

  • Each commit is identified by a unique hash (the commit identifier).

  • git log can be used to view the commits in a repository:

git log
commit fe6fe87b35c5692371df567a0bad56f259996213 (HEAD -> main)
Author: Min RK <benjaminrk@gmail.com>
Date:   Wed Aug 23 13:19:07 2023 +0200

    New README.md file and fix in myfile.py

commit 9e34d877368ad70d1da9235f9abc15bb27bd8fe7
Author: Min RK <benjaminrk@gmail.com>
Date:   Wed Aug 23 13:15:32 2023 +0200

    Initial version of myfile.py
  • Git allows us to view older versions of the repository

    • But how do we know which version we are currently at?

The role of the HEAD pointer#

  • HEAD is a special pointer that shows where you currently are in the repository history.

  • Running git commit updates the HEAD pointer to the latest commit.
git log --oneline
fe6fe87 (HEAD -> main) New README.md file and fix in myfile.py
9e34d87 Initial version of myfile.py
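
To make the HEAD pointer less abstract, the sketch below looks at it directly in a throwaway repository. The file name notes.txt and the identity are placeholders, and git init -b main assumes Git 2.28 or newer.

```shell
# Throwaway repository; notes.txt and the identity below are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q -b main .            # -b main needs Git 2.28 or newer
git config user.name "Example User"
git config user.email "user@example.com"

echo "first" > notes.txt
git add notes.txt
git commit -q -m "First commit"
cat .git/HEAD                    # prints: ref: refs/heads/main
first=$(git rev-parse HEAD)      # hash of the commit HEAD resolves to

echo "second" >> notes.txt
git commit -q -am "Second commit"
git rev-parse HEAD               # a new hash: HEAD (via main) moved forward
git rev-parse HEAD~1             # the hash stored in $first: the previous commit
```

HEAD normally stores the name of a branch rather than a hash; the branch in turn stores the hash of its latest commit.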

Some useful command line arguments for git log:

  • --oneline: summarize each commit as one line

  • git log FILENAME: show commits affecting one file or directory
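
The two options above can be combined. A small sketch in a throwaway repository (the file names one.txt and two.txt and the identity are placeholders; -n is an additional standard git log option that limits the number of commits shown):

```shell
# Throwaway repository; file names and identity below are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

echo "a" > one.txt
git add one.txt
git commit -q -m "Add one.txt"
echo "b" > two.txt
git add two.txt
git commit -q -m "Add two.txt"
echo "c" >> one.txt
git commit -q -am "Extend one.txt"

git log --oneline              # three commits, newest first
git log --oneline -- one.txt   # only the two commits that touched one.txt
git log --oneline -n 1         # only the most recent commit
```

The -- before the file name is optional here; it tells git that what follows is a path and not a branch or commit name.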

Back to the future: Getting older revisions of a repository#

  • To go to a previous snapshot of the repository:

    • Simply move the HEAD pointer to that commit.

    • All tracked files will automatically be updated to the version in that commit.

  • The command for moving the HEAD pointer is git checkout:

git log --oneline
fe6fe87 (HEAD -> main) New README.md file and fix in myfile.py
9e34d87 Initial version of myfile.py

Let’s go back to the first commit (main^1 means "the parent of the commit that main points to"):

git checkout main^1
Note: switching to 'main^1'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 9e34d87 Initial version of myfile.py
git log --oneline --all
fe6fe87 (main) New README.md file and fix in myfile.py
9e34d87 (HEAD) Initial version of myfile.py

The README.md has disappeared and we have the initial version of myfile.py back:

ls
myfile.py
cat myfile.py
print("Hello")

To move back to the latest version, we use:

git checkout main  # alternatively use the identifier of the latest commit
Previous HEAD position was 9e34d87 Initial version of myfile.py
Switched to branch 'main'
ls
README.md myfile.py
git log --oneline --all
fe6fe87 (HEAD -> main) New README.md file and fix in myfile.py
9e34d87 Initial version of myfile.py

Removing and moving files#

Files can be removed from the repository with

$ git rm myfile.py

and moved with

$ git mv myfile.py file.py
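
Both commands change the working directory and stage the change in one step, so a commit is still needed to record it. A sketch in a throwaway repository (old.txt and the identity are placeholders for this example):

```shell
# Throwaway repository; old.txt and the identity below are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"
echo 'print("Hello")' > myfile.py
echo "scratch" > old.txt
git add myfile.py old.txt
git commit -q -m "Add myfile.py and old.txt"

git mv myfile.py file.py     # rename on disk and stage the rename
git rm -q old.txt            # delete from disk and stage the deletion
git status --short           # lists a rename (R) and a deletion (D), both staged
git commit -q -m "Rename myfile.py and remove old.txt"
ls                           # file.py
```

The removed file is still available in earlier commits; git rm only affects the snapshot from this commit onwards.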

Tagging#

  • Git has the ability to tag specific commits (i.e. give them a more memorable name than the identifier).

  • Typically used to mark release points of your software
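
As a sketch of how tagging might look in practice, the example below marks a commit as v1.0 in a throwaway repository. The tag name, the file app.txt and the identity are placeholders.

```shell
# Throwaway repository; v1.0, app.txt and the identity are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "First release"
git tag -a v1.0 -m "Version 1.0"   # annotated tag on the current commit

echo "v2 work" >> app.txt
git commit -q -am "Start work on the next version"

git tag                        # prints: v1.0
git log --oneline --decorate   # the older commit is labelled (tag: v1.0)

git checkout -q v1.0           # a tag works wherever a commit identifier is expected
cat app.txt                    # prints: v1
git switch -q -                # return to the branch we came from
```

Unlike a branch, a tag does not move when new commits are created; v1.0 keeps pointing at the release commit.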

git cheat sheet (part 1)#

  • git init .: create a new (local) repository

  • git status: view the status of committed/uncommitted files

  • git commit -a: create a commit containing all changes to tracked files

  • git rm FILE: remove a file

  • git mv SOURCE DESTINATION: move/rename a file
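
A point about git commit -a that is easy to miss: it stages changes to files git already tracks, but it does not pick up new files. A sketch in a throwaway repository (file names and identity are placeholders):

```shell
# Throwaway repository; file names and identity below are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"
echo "one" > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked.txt"

echo "two" >> tracked.txt        # change a tracked file
echo "new" > untracked.txt       # create a file git has never seen

git commit -q -a -m "Update tracked.txt"   # commits the tracked change only
git status --short               # "?? untracked.txt": still needs git add
```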

Remote repositories#

We can work on git repositories that live in a remote location (for collaboration or backup).

Let’s say we created a git repository on github.com: minrk/mytest

Working with remote repositories#

Clone a remote repository to a local directory:

rm -rf ~/in3110/mytest
cd ~/in3110
git clone git@github.com:minrk/mytest.git mytest
cd mytest
Cloning into 'mytest'...
remote: Enumerating objects: 9, done.
remote: Counting objects: 100% (9/9), done.
remote: Compressing objects: 100% (5/5), done.
Receiving objects: 100% (9/9), done.
remote: Total 9 (delta 0), reused 6 (delta 0), pack-reused 0
ls
README.md main.py

Create a new commit and push it to the remote repository (requires write permission on the remote repository).

echo "print('$(date)')" > main.py
cat main.py
git add main.py
git commit -m "Add main.py file"
ls
print('Wed Aug 23 13:28:10 CEST 2023')
[main 4d6ce84] Add main.py file
 1 file changed, 1 insertion(+), 1 deletion(-)
README.md main.py
git log --oneline
4d6ce84 (HEAD -> main) Add main.py file
093f8c0 (origin/main, origin/HEAD) Add main.py file
c869fef Add main.py file
1f174d7 Create README.md
git push origin main
Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 10 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 977 bytes | 977.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
To github.com:minrk/mytest.git
   093f8c0..4d6ce84  main -> main
git log --oneline
4d6ce84 (HEAD -> main, origin/main, origin/HEAD) Add main.py file
093f8c0 Add main.py file
c869fef Add main.py file
1f174d7 Create README.md

On minrk/mytest we can see the new commit has been uploaded.

You can download updates from the remote repository with:

git pull origin main 
  • Conflicting changes might have been made on the local and remote repository.

  • This results in merge conflicts which need to be resolved manually.

  • This will be part of your first assignment.
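
To get a feeling for what such a conflict looks like, the sketch below provokes one on purpose. It makes several assumptions: a local bare repository (remote.git) stands in for GitHub, the two contributors alice and bob are hypothetical, and git init -b main needs Git 2.28 or newer. The --no-rebase flag asks git pull to merge, which avoids depending on your local pull configuration.

```shell
# Everything lives in a temporary directory; all names are hypothetical.
work=$(mktemp -d)
cd "$work"

# Alice creates the project and publishes it to a bare "remote".
git init -q -b main alice
git -C alice config user.name "Alice"
git -C alice config user.email "alice@example.com"
echo "color = red" > alice/settings.txt
git -C alice add settings.txt
git -C alice commit -q -m "Add settings.txt"
git clone -q --bare alice remote.git
git -C alice remote add origin "$work/remote.git"

# Bob clones, changes the line, and pushes first.
git clone -q "$work/remote.git" bob
git -C bob config user.name "Bob"
git -C bob config user.email "bob@example.com"
echo "color = blue" > bob/settings.txt
git -C bob commit -q -am "Use blue"
git -C bob push -q origin main

# Alice changes the same line, then pulls: git cannot merge automatically.
cd alice
echo "color = green" > settings.txt
git commit -q -am "Use green"
git pull --no-rebase origin main || echo "pull stopped because of a conflict"
cat settings.txt     # both versions, separated by <<<<<<<, ======= and >>>>>>>

# Resolve by writing the intended content, then stage and commit.
echo "color = green" > settings.txt
git add settings.txt
git commit -q -m "Merge remote changes, keep green"
git push -q origin main
```

Resolving a conflict always follows the same pattern: edit the file until it contains what you want (and no markers), git add it, then git commit to conclude the merge.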

git cheat sheet (part 2)#

  • git clone URL: clone a (remote) repository

  • git pull origin main: update file tree from (remote) repository

  • git push origin main: push changes to remote repository
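
The three commands in this cheat sheet can be tried without a GitHub account. The sketch below assumes a local bare repository (remote.git) as a stand-in for the hosted one; the directory names seed, first and second and the identity are placeholders, and git init -b main needs Git 2.28 or newer.

```shell
# Everything lives in a temporary directory; all names are placeholders.
work=$(mktemp -d)
cd "$work"

# Build a small repository and turn it into a bare "remote".
git init -q -b main seed
git -C seed config user.name "Example User"
git -C seed config user.email "user@example.com"
echo "# demo" > seed/README.md
git -C seed add README.md
git -C seed commit -q -m "Create README.md"
git clone -q --bare seed remote.git

# Two independent working copies of the same remote.
git clone -q "$work/remote.git" first
git clone -q "$work/remote.git" second
git -C first remote -v           # origin is used for both fetch and push

# Commit in the first copy and push.
cd "$work/first"
git config user.name "Example User"
git config user.email "user@example.com"
echo "print('hello')" > main.py
git add main.py
git commit -q -m "Add main.py"
git push -q origin main

# Pull in the second copy: no local commits, so nothing needs merging.
cd "$work/second"
git pull -q origin main
ls                               # README.md  main.py
```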

Branches#

  • Branches are lightweight pointers to commits that behave like independent copies of the main version

  • Allow fast testing of new code without touching the default version

  • Remember: HEAD is a special pointer that shows where you currently are in the repository history.

  • main (or sometimes master) is the default branch that is created when initializing a new repository.

Creating a branch#

  • Branches are created with the git branch NAME command.

cd ~/in3110/mysrc
git branch testing
  • The result is that we created a new pointer to the current commit.

  • The HEAD pointer still points to the branch main.

git log --oneline
fe6fe87 (HEAD -> main, testing) New README.md file and fix in myfile.py
9e34d87 Initial version of myfile.py
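
That a new branch is only a second name for the same commit can be checked directly. A sketch in a throwaway repository (file.txt and the identity are placeholders; git init -b main assumes Git 2.28 or newer):

```shell
# Throwaway repository; file.txt and the identity below are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q -b main .            # -b main needs Git 2.28 or newer
git config user.name "Example User"
git config user.email "user@example.com"
echo "x" > file.txt
git add file.txt
git commit -q -m "Add file.txt"

git branch testing
git rev-parse main testing       # two identical hashes: one commit, two names
git symbolic-ref --short HEAD    # prints: main (HEAD did not move)
git branch                       # the * marks the branch HEAD points to
```

Because no files are copied, creating a branch is nearly instantaneous regardless of the size of the repository.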

Switching to the new branch#

  • We use the git switch command to move the HEAD pointer to our new branch

git switch testing
Switched to branch 'testing'
git log --oneline
fe6fe87 (HEAD -> testing, main) New README.md file and fix in myfile.py
9e34d87 Initial version of myfile.py

Creating a commit on the new branch#

What difference does this make? Let’s make another commit:

echo "Hello world" >> testing.txt
git add testing.txt
git commit -m "Add testing.txt"
ls
[testing bf35f56] Add testing.txt
 1 file changed, 1 insertion(+)
 create mode 100644 testing.txt
README.md   myfile.py   testing.txt
git log --oneline --all
bf35f56 (HEAD -> testing) Add testing.txt
fe6fe87 (main) New README.md file and fix in myfile.py
9e34d87 Initial version of myfile.py

Switching between branches#

If we switch to the main branch again, all files will be updated to the version in main - in particular the testing.txt file will be missing:

git switch main
Switched to branch 'main'
ls
README.md myfile.py

Diverging branches#

Let’s now create another commit on the main branch:

echo "Hello world" >> main.txt
git add main.txt
git commit -m "Add main.txt"
[main 36f61eb] Add main.txt
 1 file changed, 1 insertion(+)
 create mode 100644 main.txt
git log --oneline --graph --all
* 36f61eb (HEAD -> main) Add main.txt
| * bf35f56 (testing) Add testing.txt
|/  
* fe6fe87 New README.md file and fix in myfile.py
* 9e34d87 Initial version of myfile.py

Merging branches#

We can now merge the changes from the testing branch into the main branch:

git branch # Show that we are on the main branch
* main
  testing
git merge testing -m "Merge testing into main"
Merge made by the 'ort' strategy.
 testing.txt | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 testing.txt
git log --oneline --graph --all
*   ce6fcfc (HEAD -> main) Merge testing into main
|\  
| * bf35f56 (testing) Add testing.txt
* | 36f61eb Add main.txt
|/  
* fe6fe87 New README.md file and fix in myfile.py
* 9e34d87 Initial version of myfile.py

Now main contains the files from both branches, while testing is unchanged and still lacks main.txt:

ls
README.md   main.txt    myfile.py   testing.txt
git switch testing
Switched to branch 'testing'
ls
README.md   myfile.py   testing.txt

git cheat sheet (part 3)#

  • git branch NAME: create a new branch

  • git switch NAME: move the HEAD pointer to the branch NAME (for a commit identifier, use git switch --detach COMMIT or git checkout COMMIT)

  • git merge NAME: merges the commits of the branch with name NAME into the current branch.
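
Put together, these commands support a short-lived "feature branch" workflow. The sketch below is one possible variant in a throwaway repository; the branch name feature, the file names and the identity are placeholders. It also uses git switch -c (create and switch in one step) and git branch -d (delete a merged branch), and git init -b main assumes Git 2.28 or newer.

```shell
# Throwaway repository; branch name, files and identity are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q -b main .            # -b main needs Git 2.28 or newer
git config user.name "Example User"
git config user.email "user@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Add base.txt"

git switch -q -c feature         # create the branch and switch to it
echo "idea" > feature.txt
git add feature.txt
git commit -q -m "Add feature.txt"

git switch -q main
git merge -q feature             # main has no new commits: git fast-forwards
git log --oneline --graph        # linear history, no merge commit
git branch -d feature            # remove the branch name; the commit stays on main
```

In contrast to the merge shown earlier, no merge commit is created here: since main had not moved, git only had to advance the main pointer to the commit on feature.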

That’s it for today!#

Do the interactive git tutorial on https://docs.github.com/en/get-started/quickstart/set-up-git
