Working on a Git repository with Submodule(s)

Working on a git sub-module is like working on any other git repository. Any git command that you run inside a sub-module directory is executed in the context of that sub-repository. Sub-modules can have their own branches, their own log histories, and so on, separate from their parent repository.

Editing the sub-module contents and updating its repository

Let’s look at a scenario where you want to make a small change in a sub-module. How would you go about it? You would change the current working directory to the sub-module directory, edit a file there, save it, and change the working directory back to the main repository. Then you run git status. What you will see is a message saying that there are uncommitted changes in the sub-module:

modified: styles/module (modified content)

The changes you made in the sub-module were not committed. When a sub-module contains uncommitted changes, it is considered dirty. You should always keep your sub-modules in a clean state, so we have to go back and commit the work we did.
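To see the dirty state for yourself, here is a minimal throwaway sketch. It assumes a scratch directory created with mktemp, a hypothetical library repository called lib, and a parent repository that mounts it at styles/module. The protocol.file.allow=always override is only there because recent Git versions refuse to clone sub-modules from local paths by default.

```shell
set -eu
work=$(mktemp -d)                       # scratch area, removed on exit
trap 'rm -rf "$work"' EXIT
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical library repository that will become the sub-module
git init -q lib
git -C lib symbolic-ref HEAD refs/heads/master
echo "body { color: red; }" > lib/style.css
git -C lib add style.css
git -C lib commit -q -m "initial styles"

# Parent repository with the library mounted at styles/module
git init -q parent
git -C parent symbolic-ref HEAD refs/heads/master
git -C parent -c protocol.file.allow=always submodule --quiet add "$work/lib" styles/module
git -C parent commit -q -m "add styles/module sub-module"

# Edit a tracked file inside the sub-module without committing it
echo "body { color: blue; }" > parent/styles/module/style.css

# Seen from the parent, the sub-module is now reported as dirty
status=$(LC_ALL=C git -C parent status)
echo "$status" | grep "styles/module"
```

The grep line should show the styles/module entry flagged with "(modified content)", as in the message above.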

Now, change the working directory back to the sub-module:

cd path/to/submodule

Add your changes to the staging area:

git add -u

Commit them and push to the remote:

git commit -am "colour change"
git push origin BranchName

Another important thing to remember: whenever you commit changes in a sub-module, make sure you have checked out the branch you want to commit your work to. Normally in Git, you always have a certain branch checked out. When working with sub-modules, however, the normal state is to have a specific commit (and not a branch) checked out. This is what we call a ‘Detached HEAD’ situation, and it is especially common in freshly cloned repositories. If you commit in a detached HEAD state without noticing, your commit can easily get lost: it is not attached to any local branch, so as soon as you check out anything else, nothing refers to it any more and it becomes hard to find again.
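A quick way to check which state you are in before committing is git symbolic-ref, which fails when HEAD is not on a branch. The sketch below is only an illustration: current_branch is a hypothetical helper name, and the repository is a scratch one created just for the demo.

```shell
set -eu
work=$(mktemp -d)                       # scratch area, removed on exit
trap 'rm -rf "$work"' EXIT
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Scratch repository standing in for a sub-module checkout
git init -q "$work/module"
cd "$work/module"
git symbolic-ref HEAD refs/heads/master
echo "a" > file.txt
git add file.txt
git commit -q -m "first"

# Hypothetical helper: prints the branch name, or "detached" if HEAD is not on a branch
current_branch() {
    git symbolic-ref --quiet --short HEAD || echo "detached"
}

on_branch=$(current_branch)
git checkout -q --detach HEAD           # simulate the usual sub-module state
detached=$(current_branch)
git checkout -q master                  # re-attach before committing any work
reattached=$(current_branch)

echo "$on_branch / $detached / $reattached"   # prints: master / detached / master
```

If the helper reports "detached", check out the branch you intend to commit to before making any changes.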


Having committed and pushed the sub-module changes to its remote repository, change the working directory back to your parent repository and run git status. The sub-module is still reported as modified, but this time because it points to the new commit you have just made in the sub-module:

modified: styles/module (new commits)

Think of the commit ID as a version. The parent repository records exactly one commit ID for the sub-module, and after we commit in the sub-module, the checked-out sub-module is ahead of that recorded commit. At that point we must commit the sub-module update to the parent project’s repository:

# in main repository
git commit -am "styles/module sub-module update"

The -a flag makes sure that all modifications to tracked files, including the new sub-module commit, are included in the commit even if they have not been staged yet.

After making a change and committing it in the sub-module, why do we need another commit in the parent repository?

The state of each sub-module is defined by the main repository. The sub-module repository tracks its own content, which is nested inside the main repository, while the main repository only records a reference to one commit of the nested sub-module repository. Each commit in the main repository therefore also pins the state of its sub-modules, and the git submodule update command checks out exactly the commit recorded by the main repository. This means that if you pull in or make new changes in a sub-module, you need to create a new commit in the main repository in order to track the update.
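The whole round trip can be sketched in a scratch directory. Everything below is an assumption made for the demo: lib.git is a hypothetical bare remote for the sub-module, seed is a helper repository used only to give it a first commit, and recorded_commit is a hypothetical helper that reads the sub-module commit stored in the parent's HEAD.

```shell
set -eu
work=$(mktemp -d)                       # scratch area, removed on exit
trap 'rm -rf "$work"' EXIT
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Bare "remote" for the sub-module, seeded with one commit
git init -q --bare lib.git
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "body { color: red; }" > seed/style.css
git -C seed add style.css
git -C seed commit -q -m "initial styles"
git -C seed push -q "$work/lib.git" master
git -C lib.git symbolic-ref HEAD refs/heads/master

# Parent repository with the sub-module at styles/module
git init -q parent
git -C parent symbolic-ref HEAD refs/heads/master
git -C parent -c protocol.file.allow=always submodule --quiet add "$work/lib.git" styles/module
git -C parent commit -q -m "add styles/module sub-module"

# Hypothetical helper: the sub-module commit recorded in the parent's HEAD
recorded_commit() {
    git -C "$work/parent" ls-tree HEAD styles/module | awk '{print $3}'
}

# 1. Commit and push inside the sub-module (it is on master right after "submodule add")
cd "$work/parent/styles/module"
echo "body { color: blue; }" > style.css
git commit -q -am "colour change"
git push -q origin master
new_commit=$(git rev-parse HEAD)

# 2. The parent still records the old commit until the pointer is committed
cd "$work/parent"
recorded_before=$(recorded_commit)
git commit -q -am "styles/module sub-module update"
recorded_after=$(recorded_commit)

echo "sub-module HEAD:  $new_commit"
echo "recorded before:  $recorded_before"
echo "recorded after:   $recorded_after"
```

Only after the second commit does the parent's recorded commit match the sub-module's new HEAD, which is why both commits are needed.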

Pulling remote changes

I’ve identified several ways that we can bring sub-module updates to our local repository:

Method 01:

Change the working directory to the sub-module directory and run git fetch. This downloads any new data from the remote repository, if available. Fetch is great for getting a fresh view of everything that has happened in a remote repository. It won’t integrate any of the new data into your working files. Due to its ‘harmless’ nature, you can rest assured that fetch will never manipulate or destroy your local work.

cd /path/to/sub-module
git fetch
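The ‘harmless’ nature of fetch is easy to verify in a scratch setup. In this sketch the repository names upstream and local are made up for the demo, and a plain clone stands in for the sub-module checkout.

```shell
set -eu
work=$(mktemp -d)                       # scratch area, removed on exit
trap 'rm -rf "$work"' EXIT
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical remote repository with one commit
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo "one" > upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -q -m "first"

# A plain clone stands in for the sub-module checkout
git clone -q upstream local

# New work lands on the remote
echo "two" > upstream/file.txt
git -C upstream commit -q -am "second"

# Fetch downloads the new commit but leaves HEAD and the working files alone
head_before=$(git -C local rev-parse HEAD)
git -C local fetch -q
head_after=$(git -C local rev-parse HEAD)
remote_tip=$(git -C local rev-parse origin/master)
working_copy=$(cat local/file.txt)

echo "HEAD before fetch: $head_before"
echo "HEAD after fetch:  $head_after"
echo "origin/master:     $remote_tip"
```

Only the remote-tracking branch origin/master moves; integrating the new commit is a separate step.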

Then git pull can be used to integrate the changes into the local sub-module repository as usual. This moves the sub-module to a different revision from the one recorded by the parent repository. Remember that in a ‘detached HEAD’ state your work is not attached to any local branch, so git pull has no upstream branch to default to. You therefore cannot use the shorthand git pull syntax and need to specify the remote and branch as well:

git pull origin master

If you run git status now, you will notice that HEAD is still detached: the pull integrated the changes into the detached HEAD, but no local branch was updated or checked out. If we want to keep working with the new sub-module code on a branch, we have to check out that branch explicitly (and pull again there if the local branch is behind). See below:

git checkout master

An alternative route is to cd to the sub-module directory, run git fetch, and check out the branch you want to bring the changes into. Then run git pull to integrate the remote changes into the currently checked-out branch. This moves both the branch and HEAD to the latest commit pulled in from the remote.

After pulling the changes from the remote, if you go back and run git submodule status in the parent repository, you’ll see a commit hash preceded by a plus (+) sign, which indicates that the sub-module is checked out at a different revision from the one recorded in the parent repository. Now we can commit this change to the parent repository to make it official. However, if we want to reset the sub-module to the original commit recorded in the parent repository, we can do that by running one of the commands below:

git submodule update
# or, for a single sub-module:
git submodule update /path/to/submodule

After that, if you go back to the sub-module and run git status, you’ll see that it is back in the ‘Detached HEAD’ state.
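The plus sign and the reset can be reproduced in a scratch directory. The sketch below assumes a hypothetical upstream repository lib and a parent that mounts it at styles/module. The protocol.file.allow=always override is only needed because the demo uses local paths as remotes.

```shell
set -eu
work=$(mktemp -d)                       # scratch area, removed on exit
trap 'rm -rf "$work"' EXIT
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical upstream library and a parent that uses it as a sub-module
git init -q lib
git -C lib symbolic-ref HEAD refs/heads/master
echo "body { color: red; }" > lib/style.css
git -C lib add style.css
git -C lib commit -q -m "initial styles"
git init -q parent
git -C parent symbolic-ref HEAD refs/heads/master
git -C parent -c protocol.file.allow=always submodule --quiet add "$work/lib" styles/module
git -C parent commit -q -m "add styles/module sub-module"

# A new commit lands upstream and is pulled into the sub-module
echo "body { color: blue; }" > lib/style.css
git -C lib commit -q -am "colour change"
git -C parent/styles/module pull -q --ff-only origin master

# The sub-module is now ahead of the commit recorded in the parent
before=$(git -C parent submodule status styles/module)

# Reset the sub-module to the commit recorded in the parent
git -C parent -c protocol.file.allow=always submodule --quiet update styles/module
after=$(git -C parent submodule status styles/module)
head_state=$(git -C parent/styles/module symbolic-ref --quiet --short HEAD || echo "detached")

echo "before reset: $before"
echo "after reset:  $after"
echo "HEAD state:   $head_state"
```

The first status line starts with a plus sign and the second does not. The pulled commit is still available in the sub-module repository, so it can be checked out again later.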

Method 02:
Using the git submodule update command:

Change the working directory to the parent repository and run the command below:

git submodule update --remote

This updates each sub-module to the latest commit of the branch registered for it in .gitmodules and, by default, leaves you with a detached HEAD.

To update a particular sub-module, run the command below:

git submodule update --remote path/to/submodule

Run the command below to update all sub-modules, including nested ones, from their tracked branches:

git submodule update --remote --recursive

There are three update strategies available:

  1. checkout
  2. merge
  3. rebase

Unless you specify a different update strategy, the update is performed as ‘checkout’, which leaves your sub-modules in a ‘Detached HEAD’ state. After you’ve run the above command, git status will show that there are new commits in the sub-modules. Then you can simply commit those changes to the parent repository. This command eliminates the need to navigate into each sub-module and fetch updates manually.
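To compare the default ‘checkout’ strategy with ‘merge’, here is a hedged sketch in a scratch directory. The repositories lib and parent and the helper module_head are hypothetical names for the demo. The --merge flag selects the merge strategy for a single run (--rebase works the same way).

```shell
set -eu
work=$(mktemp -d)                       # scratch area, removed on exit
trap 'rm -rf "$work"' EXIT
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical upstream library and a parent that uses it as a sub-module
git init -q lib
git -C lib symbolic-ref HEAD refs/heads/master
echo "body { color: red; }" > lib/style.css
git -C lib add style.css
git -C lib commit -q -m "initial styles"
git init -q parent
git -C parent symbolic-ref HEAD refs/heads/master
git -C parent -c protocol.file.allow=always submodule --quiet add "$work/lib" styles/module
git -C parent commit -q -m "add styles/module sub-module"

# A new commit appears upstream
echo "body { color: blue; }" > lib/style.css
git -C lib commit -q -am "colour change"
upstream_tip=$(git -C lib rev-parse HEAD)

# Hypothetical helper: branch name of the sub-module, or "detached"
module_head() {
    git -C "$work/parent/styles/module" symbolic-ref --quiet --short HEAD || echo "detached"
}

# Default strategy (checkout): the sub-module ends up detached at the upstream tip
git -C parent -c protocol.file.allow=always submodule --quiet update --remote styles/module
checkout_state=$(module_head)
checkout_tip=$(git -C parent/styles/module rev-parse HEAD)

# Merge strategy: go back to the local master branch, then merge the remote tip into it
git -C parent/styles/module checkout -q master
git -C parent -c protocol.file.allow=always submodule --quiet update --remote --merge styles/module
merge_state=$(module_head)
merge_tip=$(git -C parent/styles/module rev-parse HEAD)

# Record the new sub-module commit in the parent
git -C parent commit -q -am "styles/module sub-module update"
echo "checkout strategy: $checkout_state, merge strategy: $merge_state"
```

With ‘merge’ the sub-module stays on its branch, which is convenient if you plan to keep committing there. With ‘checkout’ you get exactly the upstream commit and a detached HEAD.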

The above methods are best suited for situations where we only want to update particular sub-modules but not the parent repository.

Method 03:
Updating both parent and sub-module(s) branches with their remote changes

Change the working directory to the parent repository directory and run git pull. This pulls all the updates made to the parent repository, including any commits that move its sub-module pointers. Your local cache will now be up to date with the sub-module’s remote, but the sub-module’s working directory won’t be updated; it still has its former contents. If you run git status now (from the parent repository), it will show a ‘new commits’ message for the relevant sub-module. You can then update the local sub-module repository by issuing the command below:

git submodule update

If you skip this step, your next commit in the parent repository may regress the sub-module to its old commit. This method is good for situations where someone in our team has updated the parent repository along with sub-module changes and pushed them to the remote, and we later want to take those updates. There is no need to commit anything here because the commit already exists.
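The teammate scenario can be played through locally. In this sketch alice and bob are hypothetical working copies of the parent repository (bob is cloned from alice, so no separate server is needed), and lib is a made-up upstream for the sub-module.

```shell
set -eu
work=$(mktemp -d)                       # scratch area, removed on exit
trap 'rm -rf "$work"' EXIT
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical upstream library
git init -q lib
git -C lib symbolic-ref HEAD refs/heads/master
echo "body { color: red; }" > lib/style.css
git -C lib add style.css
git -C lib commit -q -m "initial styles"

# alice: parent repository with the sub-module; bob: a clone of it
git init -q alice
git -C alice symbolic-ref HEAD refs/heads/master
git -C alice -c protocol.file.allow=always submodule --quiet add "$work/lib" styles/module
git -C alice commit -q -m "add styles/module sub-module"
git -c protocol.file.allow=always clone -q --recurse-submodules alice bob

# The library moves on, and alice records the new commit in the parent
echo "body { color: blue; }" > lib/style.css
git -C lib commit -q -am "colour change"
upstream_tip=$(git -C lib rev-parse HEAD)
git -C alice/styles/module pull -q --ff-only origin master
git -C alice commit -q -am "styles/module sub-module update"

# bob pulls the parent: the pointer moves, the sub-module working tree does not
git -C bob -c protocol.file.allow=always pull -q --ff-only
stale=$(git -C bob submodule status styles/module)

# bob brings the sub-module in line with the recorded commit
git -C bob -c protocol.file.allow=always submodule --quiet update
fresh=$(git -C bob submodule status styles/module)
module_tip=$(git -C bob/styles/module rev-parse HEAD)

echo "after pull:   $stale"
echo "after update: $fresh"
```

After the pull, the status line starts with a plus sign because bob's sub-module is still on the old commit. After git submodule update it matches the commit alice recorded.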
