Make Subfolder A Git Submodule

Ever needed to split a sub-folder of your Git repository out into a repository of its own? In this article I try an approach that uses a Python tool to simplify the process. The Git core team also recommends this tool as a way to move a sub-folder into a new, clean repository (link).


We’re just clones, sir. We’re meant to be expendable.

— Sinker.

Come on, let’s jump in! 🚀

Prerequisites

First of all, you need Git [1] on your machine. Second, you need an existing test Git repository and Python 3 [2] installed.

If you don't have Git yet, you can install it from its official sources; it's available on all major platforms, even Android. If you have Visual Studio [3] installed, you can use the copy of Git that ships with it instead; just locate it on your drive. Python can also be installed through the Visual Studio installer.

So where do we start?

This will be our initial test repository structure:

+-+ root
  |
  +-+ test-repository
  | |
  | +-+ desired-directory
  |   |
  |   +-+ contents
  | +-+ other-directory

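The first step is to make a copy of the test repository, either by copying it with the cp command or by creating a fresh clone with git clone. A fresh clone is the safer choice: git filter-repo checks for one and refuses to run on anything else unless you pass --force.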

+-+ root
  |
  +-+ test-repository
  | |
  | +-+ desired-directory
  |   |
  |   +-+ contents
  | +-+ other-directory
  +-+ test-repository-copy
  | |
  | +-+ desired-directory
  |   |
  |   +-+ contents
  | +-+ other-directory
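If you'd rather experiment before touching a real project, here is a minimal sandbox sketch of the layout above. The temporary directory, file names, and commit identity are all placeholders I made up for illustration. It uses `git clone --no-local` because the git-filter-repo documentation suggests that flag when cloning from a local path.

```shell
set -eu

# Hypothetical scratch location standing in for "root"; nothing here touches a real project.
root="$(mktemp -d)"
cd "$root"

# Build a tiny test-repository with the two directories from the diagram.
git init -q test-repository
git -C test-repository config user.name "Sandbox User"            # placeholder identity
git -C test-repository config user.email "sandbox@example.com"    # placeholder identity
mkdir test-repository/desired-directory test-repository/other-directory
echo "keep me" > test-repository/desired-directory/contents
echo "leave me" > test-repository/other-directory/notes
git -C test-repository add .
git -C test-repository commit -q -m "Add desired-directory and other-directory"

# Fresh clone; --no-local uses the regular Git transport instead of copying or hardlinking objects.
git clone -q --no-local test-repository test-repository-copy

ls "$root"
```

Everything lives under the temporary directory, so you can delete it when you're done.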

Next, install the Python package named git-filter-repo using the pip utility.

pip3 install git-filter-repo

git-filter-repo simplifies the process of filtering files, directories, and history. According to its GitHub page, it falls into the same category as git filter-branch. You can check its GitHub repository for the pros and cons compared with similar tools.
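A quick way to confirm that the installation worked is to ask Git for the subcommand's version. This is only a sketch; the variable name is my own. If it reports "missing", the usual suspect is that pip's script directory is not on your PATH.

```shell
# Check that git can find the filter-repo subcommand that pip installed.
if git filter-repo --version >/dev/null 2>&1; then
    filter_repo_status="available"
else
    filter_repo_status="missing"
fi
echo "git filter-repo is $filter_repo_status"
```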

Next, go into the cloned test repository and filter it down to the directory you want to separate into a new repository (in our case, desired-directory).

cd test-repository-copy
git filter-repo --path desired-directory --subdirectory-filter desired-directory

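This rewrites the history of the cloned repository and deletes everything that does not match the subdirectory filter, so the contents of desired-directory end up at the repository root. After renaming the test-repository-copy folder to desired-directory, the structure looks like this: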

+-+ root
  |
  +-+ test-repository
  | |
  | +-+ desired-directory
  |   |
  |   +-+ contents
  | +-+ other-directory
  +-+ desired-directory
  | |
  | +-+ contents

The desired-directory will now become its own repository retaining the history of files that are inside.
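To see the effect for yourself, here is a self-contained sandbox sketch of the filtering step. It rebuilds a scratch layout so it can run on its own, and again the paths, file names, and identity are placeholders. I use `--subdirectory-filter` by itself here, since the git-filter-repo documentation describes it as already selecting the path and making it the project root. The rewrite is skipped when git filter-repo is not installed.

```shell
set -eu

# Scratch sandbox standing in for "root"; all names below are placeholders.
root="$(mktemp -d)"
cd "$root"
git init -q test-repository
git -C test-repository config user.name "Sandbox User"
git -C test-repository config user.email "sandbox@example.com"
mkdir test-repository/desired-directory test-repository/other-directory
echo "keep me" > test-repository/desired-directory/contents
echo "leave me" > test-repository/other-directory/notes
git -C test-repository add .
git -C test-repository commit -q -m "Add both directories"
git clone -q --no-local test-repository test-repository-copy

if git filter-repo --version >/dev/null 2>&1; then
    # Rewrite the copy so that desired-directory becomes the repository root.
    (cd test-repository-copy && git filter-repo --subdirectory-filter desired-directory)
    # Rename the folder so it matches the layout in the diagram.
    mv test-repository-copy desired-directory
    ls desired-directory                      # files that used to live in the sub-folder
    git -C desired-directory log --oneline    # history that touched the sub-folder
else
    echo "git filter-repo not found; skipping the rewrite"
fi
```

Note that git filter-repo removes the origin remote from the rewritten clone, which helps prevent pushing the rewritten history back to the original by accident.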

After moving the sub-folder to its own repository, we go back to our original test repository and delete the filtered directory.

cd ../test-repository
git rm -rf desired-directory

Still in test-repository, add the filtered repository back as a Git submodule.

git submodule add ../desired-directory desired-directory
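Here is a sandbox sketch of these last two steps, followed by a couple of sanity checks. It creates a stand-in desired-directory repository instead of a real filtered one, and every name in it is a placeholder. The `-c protocol.file.allow=always` part is my own addition: recent Git releases block submodule clones from local paths by default, so you may need it when the new repository lives on disk rather than on a server.

```shell
set -eu

# Scratch sandbox; the stand-in "desired-directory" repository plays the role
# of the filtered repository. All names below are placeholders.
root="$(mktemp -d)"
cd "$root"
for repo in test-repository desired-directory; do
    git init -q "$repo"
    git -C "$repo" config user.name "Sandbox User"
    git -C "$repo" config user.email "sandbox@example.com"
done
echo "keep me" > desired-directory/contents
git -C desired-directory add .
git -C desired-directory commit -q -m "Stand-in for the filtered history"
mkdir test-repository/desired-directory test-repository/other-directory
echo "keep me" > test-repository/desired-directory/contents
echo "leave me" > test-repository/other-directory/notes
git -C test-repository add .
git -C test-repository commit -q -m "Original layout"

cd test-repository
# Remove the old sub-folder and record that in its own commit.
git rm -r -q desired-directory
git commit -q -m "Remove desired-directory before re-adding it as a submodule"
# Allow the file transport for this one command so a local path can be used.
git -c protocol.file.allow=always submodule add ../desired-directory desired-directory
git commit -q -m "Add desired-directory as a submodule"

git submodule status      # lists the submodule and its checked-out commit
cat .gitmodules           # records the submodule's path and URL
```

With a hosted repository you would pass its URL instead of the relative path.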

Those are all the steps needed. Check that everything is working, and see the quick review below for a summary.

Quick Review

Here are the simplified steps based on the above:

  1. Make a copy or clone the current project where the sub-folder is located.
  2. Install git-filter-repo using the command pip3 install git-filter-repo.
  3. In the cloned project folder, filter it based on the directory you want to turn into a new repository with the command git filter-repo --path <filtered-directory> --subdirectory-filter <filtered-directory>.
  4. Go to your current project folder and delete the sub-folder using the command git rm -rf <filtered-directory>.
  5. On the current project create the sub-module using git submodule add <new-repo-url> <filtered-directory>.
  6. Check if everything is okay.

That's all, folks. Always make a backup of your repository before proceeding. That said, everything is under Git version control, so you can go back and fix things if something goes wrong.

Conclusion

There are many answers on the internet about this, but most of them don't explain what happens when you run the commands. I've personally tried this approach: I was using a monorepo setup, but it grew so large that it became hard to maintain, especially when testing and checking the history.

Let me know in the comments if you have any questions. You can also DM me directly.

Follow me for similar articles, tips, and tricks ❤.


  1. Git (/ɡɪt/) is a distributed version-control system for tracking changes in any set of files, originally designed for coordinating work among programmers cooperating on source code during software development.
  2. Python is an interpreted, high-level and general-purpose programming language. Python’s design philosophy emphasizes code readability with its notable use of significant whitespace. Its language constructs and object-oriented approach aim to help programmers write clear, logical code for small and large-scale projects.
  3. Microsoft Visual Studio is an integrated development environment (IDE) from Microsoft. It is used to develop computer programs, as well as websites, web apps, web services and mobile apps. Visual Studio uses Microsoft software development platforms such as Windows API, Windows Forms, Windows Presentation Foundation, Windows Store and Microsoft Silverlight. It can produce both native code and managed code.