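"But what I would like to know is what exactly git does to ensure that commit hashes are always unique, even when they perform the exact same operation on the exact same content."
Nothing. If you create the same content, you get the same SHA-1.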
First, however, you need to realize that "same contents" of a commit means that (provided you don't get an accidental SHA-1 collision [1] or find a way to break SHA-1) you must create the same complete repository history leading up to and including the commit itself, including all the same trees, author-names, time-stamps, and so on.
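This is because the contents of a commit are what you see if you run git cat-file -p <sha-1> on the commit (plus a tag and size field that say "this object is of type commit", so there is no simple way to break things by creating a blob whose contents match those of an earlier commit). Here is an example: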
$ git cat-file -p 996b0fdbb4ff63bfd880b3901f054139c95611cf
tree e760f781f2c997fd1d26f2779ac00d42ca93f534
parent 6da748a7cebe3911448fabf9426f81c9df9ec54f
parent 740c281d21ef5b27f6f1b942a4f2fc20f51e8c7e
author Junio C Hamano <[email protected]> 1406140600 -0700
committer Junio C Hamano <[email protected]> 1406140600 -0700
Sync with v2.0.3
* maint:
Git 2.0.3
.mailmap: combine Stefan Beller's emails
git.1: switch homepage for stats
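Note that this string includes the tree and its SHA-1, the parent SHA-1s for this commit, the author and time stamp, the committer and time stamp, and the message. If you change even a single bit of this, for instance by trying to change the underlying tree or by using some different parent commit, you will get a new, different SHA-1 rather than 996b0fdbb4ff63bfd880b3901f054139c95611cf.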
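To make the "tag and size field" concrete, here is a minimal Python sketch of how an object ID is derived: SHA-1 over a header of the form "<type> <size>" plus a NUL byte, followed by the raw contents. The helper name git_object_id is my own and not part of git, and the sketch only covers SHA-1 repositories.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the SHA-1 that git assigns to an object of the given type.

    git_object_id is a hypothetical helper name used for illustration.
    """
    # Git hashes "<type> <size>\0" followed by the raw object contents.
    header = f"{obj_type} {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()


body = b"hello world\n"
blob_id = git_object_id("blob", body)

# Same bytes, but tagged as a commit: the type tag alone changes the ID.
commit_as_type_id = git_object_id("commit", body)

# Flip a single bit in the first byte of the contents.
flipped = bytes([body[0] ^ 1]) + body[1:]
flipped_id = git_object_id("blob", flipped)

print(blob_id)
print(commit_as_type_id != blob_id, flipped_id != blob_id)
```

The blob ID printed here should match what git hash-object reports for a file containing the same bytes, which is an easy way to check the sketch against a real git installation.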
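So the answer to this question:
"So in theory, if you and I performed the exact same steps, at the exact same time, with the exact same configured author, email, and so on, would we actually get the same commit SHA key?"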
is "yes". However ... you must start with the same staging area (this is what will become the tree), and the same parent commits. If you then configure your author, email, etc., exactly the same as the other guy, and both of you create a new commit at the same second (or using git's environment variables [2] to force the time stamps), you both get the same new commit.
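This is exactly what we want. It does not matter whether you create it while named "me" or I create it while named "me", as long as all the rest of the contents are the same. Whoever created it, the other "me" can clone it, and then we both have the same thing.
(If I want to make sure that the "me" who creates something is not confused with the real me, I need to add something unique, something that I know and the other me does not. Of course, once I publish the thing somewhere, the other me can get it too. But that is what signed, annotated tags are for. They can contain a GPG signature.)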
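The "two people, identical inputs" claim can be sketched in Python by building the commit text from its fields and hashing it. This is an illustration under assumptions, not git's implementation: the function name commit_id and the identity string are hypothetical, the tree is the well-known empty-tree ID, and optional headers such as encoding or gpgsig are ignored.

```python
import hashlib
from typing import List


def commit_id(tree: str, parents: List[str], author: str,
              committer: str, message: str) -> str:
    """Hash a commit built from the fields that `git cat-file -p` shows.

    commit_id is a hypothetical helper; optional headers are not handled.
    """
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    # A blank line separates the header fields from the message.
    lines += [f"author {author}", f"committer {committer}", "", message]
    body = "\n".join(lines).encode("utf-8")
    header = f"commit {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()


EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
# Hypothetical identity and time stamp, as GIT_AUTHOR_* / GIT_COMMITTER_* could force.
who = "A U Thor <author@example.com> 1406140600 -0700"

mine = commit_id(EMPTY_TREE, [], who, who, "Example commit\n")
yours = commit_id(EMPTY_TREE, [], who, who, "Example commit\n")

# One second later: the committer line differs, so the ID differs.
who_later = "A U Thor <author@example.com> 1406140601 -0700"
later = commit_id(EMPTY_TREE, [], who, who_later, "Example commit\n")

print(mine == yours)   # identical inputs give an identical ID
print(mine == later)   # a one-second difference does not
```

Nothing in the inputs records which machine or which person ran the computation, which is why the two results are interchangeable.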
[1] The chances of an accidental hash collision (for any pair of objects; chances rise with more objects) are 1 out of 2^160, which is ... very small. :-) The rise is actually very rapid, so that by the time you have a million objects, it's about 1 out of 2^121. The formula I use here is:
1 - exp(-(n * (n-1)) / (2 * r))
where r = 2^160 and n is the number of objects. Without the subtraction from 1, the equation calculates the "safety margin", as it were: the chance that we won't have an accidental hash collision. If we want to keep the collision probability in the same range as the chance that a disk drive reads back the wrong contents for a file (or at least, the chance that disk-makers claim), we need to keep it around 10^-18, which means we need to avoid putting more than about 1.7 quadrillion (1.7E15) objects in our git databases.
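For anyone who wants to check these numbers, here is a small Python sketch of the same formula. The function name collision_probability is my own. One substitution to note: it uses math.expm1 rather than a literal 1 - exp(...), because with probabilities this small the literal form rounds to zero in double-precision floating point.

```python
import math


def collision_probability(n: float, bits: int = 160) -> float:
    """Approximate chance of at least one accidental collision among n objects.

    Implements 1 - exp(-(n * (n-1)) / (2 * r)) with r = 2**bits.
    collision_probability is a hypothetical helper name.
    """
    r = 2.0 ** bits
    x = (n * (n - 1)) / (2.0 * r)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-x)


for n in (2, 1e6, 1.7e15):
    p = collision_probability(n)
    print(f"n = {n:g}: p = {p:.3g} (about 2^{math.log2(p):.1f})")
```

Passing a different value for bits shows how the same bound moves for a longer or shorter hash.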
[2] There are many git environment variables that you can set to override various defaults. The ones for the author and committer, including date and email, are:
- GIT_AUTHOR_NAME
- GIT_AUTHOR_EMAIL
- GIT_AUTHOR_DATE
- GIT_COMMITTER_NAME
- GIT_COMMITTER_EMAIL
- GIT_COMMITTER_DATE
- EMAIL
as described in the git commit-tree documentation.