Tim’s Software Engineering Blog

Getting Started with Git for Version Control

In this article, we're going to cover the basics of version control, specifically using Git, which was created in 2005 by Linus Torvalds. Git is the leader among version control systems, with almost 94% of developers using it, according to Stack Overflow's 2022 developer survey. While Git may sometimes seem intimidating and some of its concepts overwhelming, my aim with this article is to break down the tool and its commands into easy-to-understand concepts, so that you leave with a solid, practical foundation to build upon.

What is Version Control?

To start, it is helpful to define what "version control" even is in software development. While Git is the most popular version control system, it is far from the only one out there. Version control systems range from full-featured software (like Git, SVN, Mercurial, and TFVC) to something as simple as creating copies of your files with suffixes like _v1 and _v2.

At its core, "version control" is just a term for managing changes to files over time, and this can be achieved through a variety of methods. However, certain methods, such as the manual, ad hoc version control previously described, do not scale well, are error-prone, and are extremely difficult to manage when working with multiple developers. This is where Git comes in and where its sophistication shines.

What is Git?

Git is software that manages version control for you, providing powerful tools that automate and formalize versioning, make it safe and reliable, and enable collaboration among developers. Because Git is an industry standard for version control, you are extremely likely to encounter organizations that use Git in their day-to-day development work, so having a solid foundation is crucial. In addition, Git is widely integrated into tools like GitHub, Bitbucket, and Azure DevOps.

Git is what is known as a "distributed version control system", which means that every developer who is working on a particular repository has a full copy of the project's history, not just the latest version of the files. This has major benefits over a centralized version control system (such as SVN), where the full history lives on a central server: you can easily work offline if needed, and your repository is more resilient to a single point of failure. Having a full copy of the repository locally also means that most commands run quickly, because they do not need to contact a server.

Installing Git and Setting It Up

First, let's check to see if you already have Git installed. Simply open a terminal instance and enter the following command:

% git --version

Note that in all shell commands in this article, lines beginning with % signify something you enter into the terminal (minus the %), while lines beginning with > represent output from the terminal.

If you get a response like what I have shown below, then you have Git installed and are ready to go. Note that depending on your system, you may see something different here, but as long as you have a relatively recent version of Git installed, you will be fine.

> git version 2.51.0

If you get a response saying something along the lines of "command not found", then you will need to install Git onto your computer. The Git downloads page provides links to download the Git client for your specific system and will provide a guide to walk you through the install process.
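If you would rather check for Git from a script than by hand, the same test can be sketched in Python. This is a minimal sketch under a few assumptions: the helper names parse_git_version and installed_git_version are my own, and the parser expects the "git version X.Y.Z" shape shown above (some platforms append extra text after the number, which it ignores).

```python
import re
import shutil
import subprocess


def parse_git_version(output):
    """Extract (major, minor, patch) from text like 'git version 2.51.0'."""
    match = re.search(r"git version (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def installed_git_version():
    """Return the installed Git version as a tuple, or None if Git is not on the PATH."""
    if shutil.which("git") is None:
        return None  # the scripted equivalent of "command not found"
    result = subprocess.run(["git", "--version"], capture_output=True, text=True)
    return parse_git_version(result.stdout)


if __name__ == "__main__":
    version = installed_git_version()
    print("Git is not installed" if version is None else f"Found Git {version}")
```

Returning the version as a tuple of integers makes it easy to compare against a minimum version, since Python compares tuples element by element.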

If you already have Git installed, but an older version, you can update it with the following commands, depending on your system.

# If you are on macOS and have Homebrew installed
% brew install git

# If you are on Windows, try the `update-git-for-windows` command or check the official installer
% git update-git-for-windows

If these commands do not work, you should start by checking the answers on these StackOverflow links (macOS, Windows).

Getting Started with a Repository

The first thing that we'll do with Git is to initialize a repository. A repository (or "repo") is essentially a folder structure where Git will track the changes to any files or subdirectories that are inside of it. This is a different workflow than "cloning" a repository, which is when you create a new instance of an existing repository, usually with the full history.

To initialize the repository, we just need to run the git init command from the terminal in the directory we want to track, optionally providing the name of the initial branch. Here I'll explicitly specify main as the initial branch. If this optional parameter is not provided, Git uses the branch name set in the init.defaultBranch configuration option; if that option has never been set, many Git versions still fall back to master. You can set this default with a single-line command, % git config --global init.defaultBranch main

Note that I have run this in a directory that I had already created, /Users/tim/Projects/writing/test-repo. The git init command initializes the repository in whatever directory you are currently in.

% git init --initial-branch=main
> Initialized empty Git repository in /Users/tim/Projects/writing/test-repo/.git/

You can see from the response that the repo was initialized, and if you examine the files in the directory, you will see the .git directory that was created.

% ls -la
> .
> ..
> .git

If you would like to see a more interesting confirmation that your repo was created, you can also run git status to see the output message that is displayed.

% git status
> On branch main
> 
> No commits yet
> 
> nothing to commit (create/copy files and use "git add" to track)

Now that we've covered initializing a repository, let's move on to discussing some of the common steps in the Git workflow.

The Git Workflow (Core Concepts)

When working with Git, it is important to understand how Git tracks changes to files and the workflow for adding, updating, and removing files from the repository. Understanding how Git works helps you make better decisions when developing, and it puts you in a position to help others learn as well.

At the risk of oversimplifying, Git tracks changes to files in a way that avoids duplication. Each commit records a snapshot of your project, but content that has not changed is reused rather than stored again, and Git can compress similar versions of a file against one another when it packs the repository. This means that if you have a 1,000 line file and only make a change on one line, Git doesn't need to keep a full second copy of that 1,000 line file. In effect, it reuses the 999 lines that haven't changed and adds the one line that has. This helps keep version history in Git very lightweight and efficient. Contrast this with the form of version control I discussed at the beginning of the article, where you make copies of files as you edit them and name the copies application_v1.py, application_v2.py, and so on. To maintain full version history with that approach, you would have to store thousands of lines of redundant code, while Git efficiently reuses unchanged content and stores only what is new. At scale, Git's approach to version history saves significant space, while still making it easy to track what has changed.
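One way to see how Git avoids storing duplicates is to look at how it names file contents. Git identifies the contents of each file (a "blob") by a hash of those contents, so two identical files, or the same unchanged file across many commits, map to the same stored object. Here is a minimal sketch of that calculation in Python, assuming the default SHA-1 object format; the function name git_blob_id is my own.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID Git assigns to file contents (SHA-1 object format)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


original = b'print("Hello, repository!")\n'
unchanged_copy = b'print("Hello, repository!")\n'
edited = original + b'print("Another line")\n'

# Identical contents always produce the same ID, so Git stores them once.
print(git_blob_id(original) == git_blob_id(unchanged_copy))  # True
# Any change produces a different ID, and therefore a new stored object.
print(git_blob_id(original) == git_blob_id(edited))  # False
```

If you want to check the sketch against Git itself, running git hash-object on a file prints the ID that Git computes for it.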

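Conceptually, you can think of a Git repo as having three areas for your files: a working directory, a staging area, and the committed repository. The working directory holds your files as they currently exist on disk, the staging area holds the changes you have selected for your next commit, and the repository holds the history of everything you have committed.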

A simplified diagram of the workflow is shown below, with each box representing one part of the workflow, from the working directory, to the staging area, to the repository, then optionally to a remote repository.

+-----------------+ git add +-----------------+ git commit +-----------------+
|Working Directory| ------> |  Staging Area   | ---------> |   Repository    |
| (your project   |         | (snapshot prep) |            |(version history)|
| files & changes)|         |                 |            |                 |
+-----------------+         +-----------------+            +-----------------+
                                                                    |
                                                                    | git push
                                                                    v
                                                           +-----------------+
                                                           |  Remote Repo    |
                                                           |  (e.g., GitHub) |
                                                           +-----------------+
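To make the diagram concrete, here is a toy model of the three local areas in Python. It is only a sketch of the concept, not how Git is implemented: the ToyRepo class and its method names are hypothetical, and files are represented as plain dictionary entries.

```python
class ToyRepo:
    """A toy model of Git's three local areas (not Git's real implementation)."""

    def __init__(self):
        self.working_directory = {}  # file name -> current contents on disk
        self.staging_area = {}       # file name -> contents chosen for the next commit
        self.commits = []            # list of (message, snapshot) pairs, oldest first

    def write(self, name, contents):
        """Edit a file; only the working directory changes."""
        self.working_directory[name] = contents

    def add(self, name):
        """Like `git add`: copy the file's current contents into the staging area."""
        self.staging_area[name] = self.working_directory[name]

    def commit(self, message):
        """Like `git commit`: record the staged contents as a new snapshot."""
        previous = self.commits[-1][1] if self.commits else {}
        snapshot = {**previous, **self.staging_area}
        self.commits.append((message, snapshot))
        self.staging_area = {}


repo = ToyRepo()
repo.write("hello_repository.py", 'print("Hello, repository!")')
repo.add("hello_repository.py")
repo.write("hello_repository.py", 'print("Edited after staging")')
repo.commit("Add hello_repository.py")

# The commit holds what was staged, not the later edit in the working directory.
print(repo.commits[-1][1]["hello_repository.py"])
```

The detail worth noticing is that a commit records what was staged at the time, so an edit made after git add is not included until you stage the file again.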

Basic Git Commands You’ll Use Every Day

We've already covered the general concepts of the Git workflow, but now we'll dive into the basic Git commands that you'll likely find yourself using on a daily basis. This list is far from exhaustive, but knowing these basic commands (along with some of their different parameters) can take you a long way.

git status[docs]

The git status command shows you the current state of your repository. This includes any files in your working directory that are not currently tracked as a part of the repo, any files that are staged for commit, and any commits that are ready to be pushed to a remote repository (if configured).

git fetch[docs]

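git fetch downloads any references (commits, branches, etc.) that a linked remote repository contains and your local copy of the repository does not. Running git fetch updates only your local record of the remote (the remote-tracking branches, such as origin/main). It does not change your own branches or the files in your working directory, so it is a safe way to see what could be integrated.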

git pull[docs]

The git pull command retrieves the changes, if any, that a linked remote repository contains and integrates them into your current branch. Behind the scenes, git pull runs git fetch and then merges (or rebases, depending on your configuration) the fetched changes. One way to think of the difference is that git fetch is like bringing the mail in from your mailbox and setting it on the table, while git pull also opens the mail and files it away.
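The difference can also be sketched as a toy model in Python: git fetch updates only the remote-tracking branch (origin/main, your local record of the remote), while git pull goes on to update your own branch. This is an illustration only. Branches are reduced to lists of commit IDs, the functions fetch and pull are hypothetical stand-ins for the real commands, and the pull here handles only the simple fast-forward case.

```python
def fetch(remote, tracking):
    """Like `git fetch`: update the local record of the remote branch."""
    tracking[:] = remote  # download what the remote has; the local branch is untouched


def pull(remote, tracking, local):
    """Like `git pull` in the fast-forward case: fetch, then integrate."""
    fetch(remote, tracking)
    if tracking[: len(local)] == local:  # local history is a prefix of the remote's
        local[:] = tracking


remote_main = ["a1", "b2", "c3"]  # commits on the remote's main branch
origin_main = ["a1"]              # remote-tracking branch (origin/main)
local_main = ["a1"]               # your own main branch

fetch(remote_main, origin_main)
print(origin_main, local_main)    # origin/main is updated; main still has one commit

pull(remote_main, origin_main, local_main)
print(local_main)                 # main now matches the remote
```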

git add[docs]

The git add command will take files in your working directory and move them into the staging area. With git add, you can specify either a specific file or a path to a directory to add multiple files to the staging area at once, for example git add path/to/file.py. We'll see this particular command in action in our section where we cover a common Git workflow.

git commit[docs]

git commit is what you will run to actually commit your staged file contents to the repository itself. There are a number of different flags that can be used when running the command, but probably the most frequently used is -m, which allows you to specify a commit message for the particular file changes. Providing a descriptive, but concise, commit message is extremely important for the developers or maintainers who will be working with your code after you. A message like "added new .restore() method to Task class" is helpful, while "changes to files" is not.

# Sample command syntax
% git commit -m "A helpful, but concise commit message"

git push[docs]

The git push command is used to synchronize any new changes that you have made in your local copy of the repository with the "remote" copy of the repository. Often, this is used to synchronize your changes with a copy of the repo that is in GitHub or another similar hosting service.

git log[docs]

The git log command provides an easy way to see the recent commits to your repository and their messages. git log includes a significant number of formatting options that can be used to modify the display of the commit log, but the most basic options are to run git log without any parameters to display the full log, or to pass a numeric flag (such as git log -5) to show only the most recent N commits.

git branch[docs]

While we are not covering different branching strategies in this article, the git branch command can be used to create a new branch, delete an existing branch, or list all branches within a repository. Branches allow you to work on different strands of development in isolation, without affecting the work that is happening on any other branch.

git switch[docs]

The git switch command, as the name might suggest, is used to switch between different branches of a repository. When you switch between branches in a repository, the working directory is updated to match the state of the repository for that branch. As a side note, you may encounter tutorials that make use of git checkout, which can be used to switch branches in the same way. git switch, however, was introduced more recently (in Git 2.23) to separate branch switching from the other uses of git checkout. Let's cover a quick example of using git branch and git switch to create a new branch, switch between branches in our repo, commit a change, and then merge it back into our main branch. The example picks up from the repository built in the Common Workflow Example later in this article, so it assumes that hello_repository.py already exists and has been committed.

% git branch new-feature
% git switch new-feature
> Switched to branch 'new-feature'
% echo 'print("Hello, again")' >> hello_repository.py
% git add hello_repository.py
% git commit -m "Added another hello message after the previous goodbye."
% git switch main
> Switched to branch 'main'
% git merge new-feature
> Updating 2e5e5d6..5c3f274
> Fast-forward
>  hello_repository.py | 1 +
>  1 file changed, 1 insertion(+)
% git branch -d new-feature
> Deleted branch new-feature (was 5c3f274).
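The "Fast-forward" line in the merge output reflects a simple idea: a branch is a name that points at a commit, and when main has no new commits of its own, merging only has to move main forward to where new-feature points. A toy sketch of that idea in Python follows. The dictionaries and the can_fast_forward helper are hypothetical, the short commit IDs are taken from the output above, and 2e5e5d6 is treated as the first commit purely to keep the example small.

```python
# Each commit records its parent; a branch is a name pointing at one commit.
parents = {"2e5e5d6": None, "5c3f274": "2e5e5d6"}
branches = {"main": "2e5e5d6", "new-feature": "5c3f274"}


def can_fast_forward(target, source):
    """True if the target branch's commit is an ancestor of the source branch's commit."""
    commit = branches[source]
    while commit is not None:
        if commit == branches[target]:
            return True
        commit = parents[commit]
    return False


if can_fast_forward("main", "new-feature"):
    branches["main"] = branches["new-feature"]  # move the pointer; no merge commit needed

del branches["new-feature"]  # like `git branch -d`: removes the name, not the commit
print(branches)  # {'main': '5c3f274'}
```

If main had gained commits of its own in the meantime, a fast-forward would not be possible and Git would create a merge commit instead.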

git clone[docs]

The git clone command will create a clone (effectively, a copy) of an existing Git repo, by default in a new directory that matches the name of the repository. For example, if we had our test-repo repository hosted in GitHub, the command to clone it might look something like the following:

% git clone https://github.com/<username>/test-repo.git

This would create a test-repo directory, along with the necessary files to track all version history for the repository.

Common Workflow Example

Let's go over a common scenario with a Git repository to show how some of these commands actually work in practice. We'll use the same repository that we created earlier to demonstrate. Before we get into creating and committing a file though, we want to tell Git who we are, which we can do through the git config command. The --global flag sets the user email and name for your user account, so they apply to all of your repositories. You can omit it if you want to set the name and email for this repository only.

% git config --global user.email "your-email@example.com"
% git config --global user.name "Your Name"

Having told Git who we are, let's create a new file called hello_repository.py in the directory of your Git repo.

% touch hello_repository.py

Let's add a basic print statement to the file and then save it.

# `hello_repository.py` contents
print("Hello, repository!")

After you have saved the file, in your terminal instance, make sure you are in the same directory as the Git repo and then run git status to see the output. You should see something very similar to what is shown below.

% git status
> On branch main
>
> No commits yet
>
> Untracked files:
>   (use "git add <file>..." to include in what will be committed)
>         hello_repository.py
>
> nothing added to commit but untracked files present (use "git add" to track)

Next, let's explicitly add our file to the staging area with git add. You won't see any output from this command, but after, run git status again to see how the output has changed from before.

% git add hello_repository.py
% git status
> On branch main
> 
> No commits yet
> 
> Changes to be committed:
>   (use "git rm --cached <file>..." to unstage)
>         new file:   hello_repository.py

Notice how the hello_repository.py file is now listed under "Changes to be committed", whereas before it was listed under "Untracked files"? This indicates that the file has been staged and is ready to be committed to the repository. Let's do just that by running git commit and adding a good commit message.

% git commit -m "Adding a new hello_repository.py file to test our Git workflow"
> [main (root-commit) a165994] Adding a new hello_repository.py file to test our Git workflow
>  1 file changed, 1 insertion(+)
>  create mode 100644 hello_repository.py

If we run git log (remember, this shows us recent commits), we will see output very similar to the following.

% git log -1
> commit a165994eedc9ed6010ffbe5d0fcc9ed244f2aa22 (HEAD -> main)
> Author: Your Name <your-email@example.com>
> Date:   Thu Aug 21 22:03:59 2025 -0700
> 
>     Adding a new hello_repository.py file to test our Git workflow

Congratulations! You've now created a file, staged it, and committed it to your repo.

Let's try making another change to the file, staging it, and committing it. You can add another print statement:

# `hello_repository.py` contents
print("Hello, repository!")
# A new statement, but something seems off...
print("Now goodby!")

Let's run the git status command to see the current state of the repository and examine the differences between this output and that of our previous invocation of git status.

% git status
> On branch main
> Changes not staged for commit:
>  (use "git add <file>..." to update what will be committed)
>  (use "git restore <file>..." to discard changes in working directory)
>       modified:   hello_repository.py
> 
> no changes added to commit (use "git add" and/or "git commit -a")

Notice how this time, because we have already added the hello_repository.py file to our repo, the git status command shows modified:   hello_repository.py instead of listing hello_repository.py under "Untracked files". Let's stage our file with git add, check the status of the repository, and then commit the updates with git commit.

% git add hello_repository.py
% git status
> On branch main
> Changes to be committed:
>  (use "git restore --staged <file>..." to unstage)
>        modified:   hello_repository.py
% git commit -m "Added another print statement to demonstrate file updates"

But wait a minute! What happens when I realize that I have made a typo in a file that I have already committed to the repo? As we saw, I introduced a typo with my newly added print statement. Fortunately, Git provides a way to easily roll back a commit in your local repository, using the git reset command. Let's explore the syntax in a bit more detail.

To undo our last commit, we'll use git reset --soft HEAD~1. What this effectively does is rewind the repository to its state right before we ran our last commit. The --soft flag tells git reset that you want to undo the commit itself but keep the changes to the file(s) that were a part of it, which remain staged. The HEAD~1 portion of the command tells Git that you want to rewind your current position in the repository by one commit (HEAD~2 would be two commits, etc.). We can see this in action by executing the following sequence of commands and examining our repository state at each step.

% git status
> On branch main
> nothing to commit, working tree clean
% git reset --soft HEAD~1
% git status
> On branch main
> Changes to be committed:
>   (use "git restore --staged <file>..." to unstage)
>        modified:   hello_repository.py
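The HEAD~1 notation can be read as "start at the current commit and step back one parent". As a toy sketch in Python (the commit names and the resolve helper are hypothetical, and only a single-parent history is modeled):

```python
# A tiny single-parent history: each commit ID maps to its parent's ID.
parents = {"first": None, "second": "first", "third": "second"}
head = "third"


def resolve(start, steps_back):
    """Follow parent links `steps_back` times, like HEAD~N on a linear history."""
    commit = start
    for _ in range(steps_back):
        commit = parents[commit]
        if commit is None:
            raise ValueError("history does not go back that far")
    return commit


print(resolve(head, 1))  # second
print(resolve(head, 2))  # first

# A soft reset moves the branch to the resolved commit and leaves the files alone.
head = resolve(head, 1)
```

From this state you can correct the typo in the file, stage it again with git add, and create a new commit in place of the one you undid.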

Note that if you are working with a remote repository like GitHub and have already pushed your commits, rolling back commits becomes much more challenging. For this reason, regardless of whether you are working by yourself in a local repository or with others, you want to be mindful about what you commit to your repository.

Feel free to explore your test repository and make some more commits to get familiar with the syntax and mechanics of working with a Git repo!

Tips and Best Practices

While many developers have their own personal preferences and practices for using Git, there are some practices that I believe are foundational to any Git workflow, regardless of the developer or organization.

Commit Often

This practice is extremely important when working with other developers, who may be simultaneously making changes to other files, or even the same one that you are working on. You do not want to be in a position where you have worked on a file for multiple days, made numerous changes, and go to commit and integrate your changes into the remote repository, only to find that there are merge conflicts. While it is sometimes unavoidable to have conflicts, these can be minimized by frequent integrations with other developers.

Use Descriptive Commit Messages

Descriptive commit messages leave a helpful trail of comments that anyone else using your repository can follow to understand what has happened over a particular period of time. Think about the last time you read code that you wrote more than six months ago. Would it have been helpful to have a series of messages explaining what changes you made and why? The answer is probably yes.

Run git status Often

The git status command tells you what the current state of your repository is and is an invaluable tool for helping you figure out where to go next. I have personally been using Git for years and run git status all the time in between commands to make sure I didn't inadvertently make an undesired change, or whenever I come back to the terminal after performing some other task.

Don't Commit Secrets or Keys

When working with a repository, special care needs to be taken to ensure that secrets or keys are not exposed in the files that are committed. Many people have found themselves in a scenario where they commit a file, only to realize later that they left an API key or a password in the file. If you are fortunate, you will realize it before anyone else does and can revoke the key or change the password, but the better path is to just avoid this scenario in the first place. My recommendation is to have any key or password stored in an environment variable that is read by the application, or in a configuration file that is explicitly ignored by the .gitignore file.
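As a minimal sketch of the environment-variable approach in Python: the variable name MY_SERVICE_API_KEY and the helper load_api_key are hypothetical placeholders, so substitute whatever your application actually uses.

```python
import os


def load_api_key(variable_name="MY_SERVICE_API_KEY"):
    """Read a secret from the environment instead of hard-coding it in a tracked file."""
    api_key = os.environ.get(variable_name)
    if not api_key:
        raise RuntimeError(
            f"Set the {variable_name} environment variable before running this program"
        )
    return api_key
```

Because the key lives outside the repository, there is nothing sensitive for git add to pick up, and failing loudly when the variable is missing makes a misconfigured environment obvious right away.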

Use a .gitignore File

The .gitignore file is used to exclude certain files or directories from a repository and you will often (but not always) see these files at the root level of a repository. For example, if you're using Python, you should add the directory of your virtual environment to the .gitignore file to avoid including unnecessary files in the repo and bloating the size. As an example, the .gitignore file below excludes the .venv file or directory and a .env file or directory.

.venv
.env

The .gitignore documentation includes more details on what patterns can be used to specify files or directories that are excluded from a repository. Additionally, GitHub provides a large number of .gitignore templates that you can use for a project, and there is likely already one for the language you are using to get you started.
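To get a feel for how simple patterns like .venv and .env behave, here is a rough Python approximation. It is a sketch only: it handles plain names and shell-style wildcards matched against each component of a path, and it does not implement the full .gitignore rules (negation with !, patterns anchored with a leading /, ** and so on). The function name is_ignored is my own.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_ignored(path, patterns):
    """Approximate check: does any component of `path` match a simple ignore pattern?"""
    parts = PurePosixPath(path).parts
    return any(fnmatchcase(part, pattern) for part in parts for pattern in patterns)


patterns = [".venv", ".env"]

print(is_ignored(".env", patterns))                 # True
print(is_ignored(".venv/lib/site.py", patterns))    # True: inside an ignored directory
print(is_ignored("hello_repository.py", patterns))  # False
```

For a real repository, git check-ignore -v followed by a path tells you whether Git itself ignores that path and which pattern is responsible.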

Where to Go Next

Having covered the basics of using Git, the best thing to do next is to use it! Start a project on GitHub, use it to track code or notes for a class, or see if you can integrate Git into a workflow in your day-to-day work. Exploring the documentation for a particular command can also help you understand more about the options that are available to increase its flexibility and usefulness. I would also recommend checking out some of the sources I've linked below to get more details on Git and its usage.

Sources
