An Illustrated Look at How Git add/commit Work

I won't introduce Git itself here; a quick search on Baidu turns up plenty of material.

This post explores how the files in the .git directory change behind two Git operations: add and commit.

PS: The Git database has three main kinds of storage object (tree, commit, and blob), all stored under the .git/objects directory. They are introduced below. To inspect the database files we mainly use two commands: git cat-file -t <key value> and git cat-file -p <key value> (-t shows the type of the data the key value refers to, -p shows its content).

Finally, there is the git ls-files --stage command, which shows the contents of the staging area file.
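As a rough picture of what the two git cat-file commands do for a single unpacked object, here is a minimal Python sketch. It assumes the standard loose-object layout (a zlib-compressed file holding the type, a space, the size, a NUL byte, and then the data) and uses a temporary directory as a stand-in for .git/objects. The helper names write_loose_object and cat_file are made up for this illustration and are not part of Git.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def write_loose_object(objects_dir: Path, obj_type: str, body: bytes) -> str:
    """Store body as a loose object and return its 40-character key."""
    raw = f"{obj_type} {len(body)}".encode() + b"\x00" + body
    key = hashlib.sha1(raw).hexdigest()
    path = objects_dir / key[:2] / key[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(raw))
    return key


def cat_file(objects_dir: Path, key: str) -> tuple[str, bytes]:
    """Return (type, content): the type is what -t reports, the content what -p shows for a blob."""
    raw = zlib.decompress((objects_dir / key[:2] / key[2:]).read_bytes())
    header, _, body = raw.partition(b"\x00")
    obj_type, size = header.decode().split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match content length")
    return obj_type, body


with tempfile.TemporaryDirectory() as tmp:
    objects = Path(tmp) / "objects"  # stand-in for .git/objects
    key = write_loose_object(objects, "blob", b"add")
    obj_type, body = cat_file(objects, key)
    print(obj_type, body)
```

The real git cat-file -p also pretty-prints tree objects and can read packed objects; this sketch covers neither.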

First, pick a directory and initialize a Git repository in it with git init (F://GIT is used here)

Once the command finishes, a new .git folder appears in the F://GIT directory

At this point .git/objects contains only two subdirectories, info and pack (they hold the packed data files that Git produces when it compresses objects, for example after git gc runs; they are not covered here and may get a post of their own later)


The add command puts file data into the staging area (search Baidu for the concept). The staging area is in fact just a file (.git/index). Nothing has been done in this repository yet, so the file does not exist.

On to the main topic

  1. First, create a demo.txt file in the F://GIT directory containing the text add (any content will do), then run git add on it (command omitted)
  2. The .git/index file now exists. Opened directly it looks like garbage because it is a binary file, but Git provides a command to view its contents (git ls-files --stage)
  3. This is the content of the staging area. d28d40b18823a27071d0e9ce89c149adb3f9c4ee is a hash value (the key value) that points to a specific file under .git/objects; the trailing demo.txt is the path of the file that was just added
  4. Look at what changed under .git/objects. There is a new d2 folder (named after the first two characters of the hash value above) containing one file, whose name is the rest of the hash value (8d40b18823a27071d0e9ce89c149adb3f9c4ee)
  5. Use git cat-file -t <key value> to check the file's type (the key value may be abbreviated): it is the blob storage object mentioned above
  6. Then use git cat-file -p <key value> to look at the file's content: it is the content of demo.txt. So a blob storage object stores the content of a file.
From the above we can see that add creates blob storage objects. Next, let's look at commit
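The relationship between the key value and the location under .git/objects can be sketched in a few lines of Python. This assumes Git's default SHA-1 object format, where a blob's key is the SHA-1 of the header "blob <size>", a NUL byte, and the file content. The function names blob_key and object_path are illustrative only.

```python
import hashlib


def blob_key(content: bytes) -> str:
    """Compute the key Git would assign to a blob with this content."""
    header = f"blob {len(content)}".encode() + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def object_path(key: str) -> str:
    """Split a key into the two-character folder and the 38-character file name."""
    return f".git/objects/{key[:2]}/{key[2:]}"


# The empty blob has a well-known key.
print(blob_key(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

# The key from the staging area listing maps onto the d2 folder seen above.
print(object_path("d28d40b18823a27071d0e9ce89c149adb3f9c4ee"))
```

Because the key depends only on the content, adding two files with identical content produces a single blob object.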

PS: Rules for the index file's contents: adding the same file again overwrites its entry (only the latest version is kept), while adding a different file appends a new entry. Undoing an add removes only the entry for that file; the other entries remain. Try it yourself if you are interested.
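To show why .git/index looks garbled in a text editor yet lists cleanly with git ls-files --stage, here is a simplified Python sketch of the binary layout. It assumes index format version 2 with no extensions, and it builds a synthetic index with zeroed timestamp and stat fields instead of reading a real repository. Both function names are hypothetical.

```python
import hashlib
import struct

# Fixed-size part of a version 2 entry: ten 32-bit stat fields,
# the 20-byte object key, and 16 bits of flags (62 bytes in total).
ENTRY_HEAD = struct.Struct(">10I20sH")


def build_index(entries):
    """Build a minimal version 2 index from (mode, key, path) tuples."""
    body = b"DIRC" + struct.pack(">II", 2, len(entries))
    for mode, key, path in sorted(entries, key=lambda e: e[2]):
        name = path.encode()
        # Timestamps, device, inode, uid, gid, and size are zeroed for brevity.
        entry = ENTRY_HEAD.pack(0, 0, 0, 0, 0, 0, mode, 0, 0, 0,
                                bytes.fromhex(key), len(name)) + name
        entry += b"\x00" * (8 - len(entry) % 8)  # 1 to 8 NUL bytes of padding
        body += entry
    return body + hashlib.sha1(body).digest()  # trailing checksum


def ls_files_stage(data):
    """Format each entry the way git ls-files --stage prints it."""
    signature, version, count = struct.unpack_from(">4sII", data, 0)
    if signature != b"DIRC" or version != 2:
        raise ValueError("not a version 2 index")
    offset, lines = 12, []
    for _ in range(count):
        fields = ENTRY_HEAD.unpack_from(data, offset)
        mode, key, flags = fields[6], fields[10], fields[11]
        name_len = flags & 0x0FFF       # low 12 bits: path length
        stage = (flags >> 12) & 0x3     # 2 bits: merge stage
        start = offset + ENTRY_HEAD.size
        path = data[start:start + name_len].decode()
        lines.append(f"{mode:o} {key.hex()} {stage}\t{path}")
        offset += (ENTRY_HEAD.size + name_len + 8) // 8 * 8
    return lines


KEY = "d28d40b18823a27071d0e9ce89c149adb3f9c4ee"
index_bytes = build_index([(0o100644, KEY, "demo.txt")])
listing = ls_files_stage(index_bytes)
print(listing[0])
```

Each entry is keyed by its path, which is why adding the same file again replaces its entry instead of creating a second one.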


  1. Building on the previous add, perform a commit (the message is version1)
  2. Look at what changed under .git/objects: there are two new files (the tree and commit storage objects discussed next)
  3. Start with the first new folder (as with the blob storage object, the folder name and file name together make up the key value of the storage object)
  4. As before, check the file's type and content. The file turns out to be a tree storage object, and its content records the key value and file path (demo.txt) of the blob storage object created by the add
  5. Finally, check the type and content of the other file. It turns out to be a commit storage object whose content records the key value of the corresponding tree object, the committer information and time, and the commit message (version1). There would normally also be a parent entry (holding the key value of the previous commit), but since this is our first commit, it does not appear
At this point, the chain by which Git links each version to its content is clear.
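The content of the tree and commit objects described in the steps above can be reconstructed with a short Python sketch. It assumes the default SHA-1 object format and a tree that holds only plain files. The author identity, timestamp, and the second commit (version2) are invented for illustration, so the resulting keys will not match a real repository.

```python
import hashlib


def object_key(obj_type: str, body: bytes) -> str:
    """Key of any object: SHA-1 over '<type> <size>', a NUL byte, and the body."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries):
    """entries are (mode, file name, blob key) tuples, stored sorted by name."""
    out = b""
    for mode, name, key in sorted(entries, key=lambda e: e[1]):
        # Each entry is 'mode name', a NUL byte, then the 20 raw key bytes.
        out += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(key)
    return out


def commit_body(tree_key, message, who, when, parents=()):
    """Text of a commit object; the parent lines are absent for a first commit."""
    lines = [f"tree {tree_key}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {who} {when}", f"committer {who} {when}", "", message]
    return ("\n".join(lines) + "\n").encode()


BLOB_KEY = "d28d40b18823a27071d0e9ce89c149adb3f9c4ee"  # blob key from the add step
WHO, WHEN = "Example User <user@example.com>", "1500000000 +0800"  # made-up identity

tree = tree_body([("100644", "demo.txt", BLOB_KEY)])
tree_key = object_key("tree", tree)
first = commit_body(tree_key, "version1", WHO, WHEN)
first_key = object_key("commit", first)
second = commit_body(tree_key, "version2", WHO, WHEN, parents=[first_key])

print(first.decode())
print(second.decode())
```

The second commit's text contains a parent line holding the first commit's key, which is the entry missing from the single-commit repository above.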

Relationship diagram

Finally, let's talk about Git's log of history. Without it, finding earlier commits would mean reading through many commit storage object files one by one, which I personally think would be slow.

The .git directory contains a log file (.git\logs\HEAD, the reflog for HEAD) that records each commit HEAD has pointed to. Its contents look like this
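Each line of that log file has a fixed layout: the old key, the new key, the identity, a timestamp with time zone, then a tab and a description. Here is a minimal Python sketch of parsing one line. The sample line is entirely made up (placeholder keys and identity), and parse_reflog_line is a hypothetical helper.

```python
def parse_reflog_line(line: str) -> dict:
    """Split one reflog line into its fields."""
    meta, _, message = line.rstrip("\n").partition("\t")
    old_key, new_key, rest = meta.split(" ", 2)
    # The identity may contain spaces, so peel the last two fields off the right.
    identity, timestamp, timezone = rest.rsplit(" ", 2)
    return {
        "old": old_key,
        "new": new_key,
        "identity": identity,
        "timestamp": int(timestamp),
        "timezone": timezone,
        "message": message,
    }


# Made-up example: a first commit has an all-zero old key.
sample = (
    "0" * 40 + " " + "1" * 40
    + " Example User <user@example.com> 1500000000 +0800"
    + "\tcommit (initial): version1\n"
)
entry = parse_reflog_line(sample)
print(entry["new"], entry["message"])
```

Reading the new-key column from top to bottom gives the sequence of commits HEAD has moved through.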

Basic summary

The core of Git is its object database, whose most important members are the commit, tree, and blob objects

commit: records the hash key value of the corresponding tree object, the hash key value of the previous commit (which gives the order of versions), the author and committer, the commit message, the commit time, and other metadata

tree: records the file names and directory structure of the corresponding version

blob: records the content of a file

Git looks up each object by its key value, which makes it easy to locate any committed version. Features such as Git branches and tags are built on the same mechanism.
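To illustrate how branches and tags rest on key values, here is a minimal Python sketch that follows HEAD to a commit key. It assumes refs stored as individual files (not packed into .git/packed-refs) and a lightweight tag. The directory is a temporary fake, and the branch name master, the tag name v1, and the commit key are placeholders.

```python
import tempfile
from pathlib import Path


def resolve_head(git_dir: Path) -> str:
    """Follow HEAD through any 'ref: ...' indirection to a commit key."""
    value = (git_dir / "HEAD").read_text().strip()
    while value.startswith("ref: "):
        value = (git_dir / value[len("ref: "):]).read_text().strip()
    return value


with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"  # fake repository layout
    (git_dir / "refs" / "heads").mkdir(parents=True)
    (git_dir / "refs" / "tags").mkdir(parents=True)
    commit_key = "1" * 40  # placeholder commit key

    # A branch and a lightweight tag are each a small file holding a key.
    (git_dir / "HEAD").write_text("ref: refs/heads/master\n")
    (git_dir / "refs" / "heads" / "master").write_text(commit_key + "\n")
    (git_dir / "refs" / "tags" / "v1").write_text(commit_key + "\n")

    head_key = resolve_head(git_dir)
    tag_key = (git_dir / "refs" / "tags" / "v1").read_text().strip()
    print(head_key == tag_key)
```

From the commit key, the tree and blob objects of that version can be reached by the lookups described earlier.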