I was not aware that `git cat-file --batch` was supposed to work the way you describe in your question.
(It might since Git 2.8, March 2016; see below.)
Even in the "GitMagic book" http://www-cs-students.stanford.edu/%7Eblynn/gitmagic/book.pdf, in a Unix environment, `git cat-file` is used the way sinelaw https://stackoverflow.com/users/562906/sinelaw mentions in the comments:
Check that this file does indeed contain the above by typing:
    $ echo 05b217bb859794d08bb9e4f7f04cbda4b207fbe9 | git cat-file --batch
As the OP Alex.Shen https://stackoverflow.com/users/381646/alex-shen mentions above, this is a newline http://en.wikipedia.org/wiki/Newline issue:
Git commands always expect LF (Line Feed, U+000A), not the Windows CRLF sequence (CR+LF: CR (U+000D) followed by LF (U+000A)).
With the '|', the input uses the EOL character of the msysgit bash shell (LF), so it always works.
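The same idea can be sketched in Python: build the stdin for `git cat-file --batch` with bare LF terminators whatever the platform. This is a minimal sketch, not part of the original answer; the helper names `batch_input` and `cat_file_batch` are hypothetical, and `cat_file_batch` assumes `git` is on the PATH and `repo` is a Git repository.

```python
import subprocess


def batch_input(object_names):
    """Join object names into LF-terminated bytes, dropping any stray CR/LF."""
    lines = []
    for name in object_names:
        # A name read from a CRLF text file may still carry its "\r\n".
        lines.append(name.rstrip("\r\n").encode("utf-8") + b"\n")
    return b"".join(lines)


def cat_file_batch(object_names, repo="."):
    """Run `git cat-file --batch` in `repo` and return its raw stdout."""
    result = subprocess.run(
        ["git", "-C", repo, "cat-file", "--batch"],
        input=batch_input(object_names),
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout
```

Passing bytes through `input=` avoids any text-mode newline translation, so the process receives exactly the LF-terminated lines built above.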
Note: Git 2.5+ (Q2 2015) adds support for symbolic links to `git cat-file --batch`.
(New Git releases for Windows are available at github.com/git-for-windows/git/releases https://github.com/git-for-windows/git/releases)
See commit 122d534 https://github.com/git/git/commit/122d53464b29d3ac20891c5ee2f75ac5ecbb7b20 by David Turner (csusbdt) https://github.com/csusbdt, 20 May 2015.
(Merged by Junio C Hamano -- gitster -- https://github.com/gitster in commit 67f0b6f https://github.com/git/git/commit/67f0b6f3b2226ea858c616028375dcc3c46ccc37, 01 Jun 2015)
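cat-file: add `--follow-symlinks` to `--batch`
"`git cat-file --batch(-check)`" learned the "`--follow-symlinks`" option, which follows an in-tree symbolic link when asked about an object via the extended SHA-1 syntax.
E.g. `HEAD:RelNotes`, which points at `Documentation/RelNotes/2.5.0.txt`.
With the new option, the command behaves as if `HEAD:Documentation/RelNotes/2.5.0.txt` had been given as input instead.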
Update February 2016:
Git 2.8 adds CRLF support to some Git commands:
See commit a551843 https://github.com/git/git/commit/a551843129dc4d329d4f7915a9c10120963acd7d, commit 933bea9 https://github.com/git/git/commit/933bea922c202a7ff9b133a5cca4a805a7013aa2, commit 1536dd9 https://github.com/git/git/commit/1536dd9c1df0b7167b139f6666080cc4774ef63f, commit b42ca3d https://github.com/git/git/commit/b42ca3dd0f157d0c23c9a034bc68257e1748238a, commit 692dfdf https://github.com/git/git/commit/692dfdfa622c6286999609a4fef59724124ca794, commit 3f16396 https://github.com/git/git/commit/3f163962282d2d8bea914c32d81ad38544044f78, commit 18814d0 https://github.com/git/git/commit/18814d0e2d7d5924a799bcf0cae3a0aaba569613, commit 1f3b1ef https://github.com/git/git/commit/1f3b1efd18a935fed41431132c67cde5a94833ae, commit 72e37b6 https://github.com/git/git/commit/72e37b6ac851c3926956c9d11a40260f08bf1c5e, commit 6e8d46f https://github.com/git/git/commit/6e8d46f9d4bb3643b8fdf4e1b0856f7910dbc948, commit c0353c7 https://github.com/git/git/commit/c0353c78e80434b9385e134c69b54b048c382077 (28 Oct 2015) by Junio C Hamano (gitster) https://github.com/gitster.
(Merged by Junio C Hamano -- gitster -- https://github.com/gitster in commit 0175655 https://github.com/git/git/commit/017565525f969c26c43443d998a1283cac843d10, 03 Feb 2016)
In particular, commit b42ca3d https://github.com/git/git/commit/b42ca3dd0f157d0c23c9a034bc68257e1748238a uses strbuf.c#strbuf_getline() https://github.com/git/git/blob/8f309aeb8225a9c26f20c0dbc031f1ea8df75d49/strbuf.c#L504-L533 (which can accept a line terminator other than LF or NUL).
With Git 2.8:
cat-file: read batch stream with `strbuf_getline()`
It is possible to prepare a text file with a DOS editor and feed it as a batch command stream to the command.
Note that before Git 2.33 (Q3 2021), "git cat-file --batch-all-objects https://github.com/git/git/blob/5d96bcbc0602f96ccee3111ff93b05389cd6eae6/Documentation/git-cat-file.txt#L93" (man https://git-scm.com/docs/git-cat-file#Documentation/git-cat-file.txt---batch-all-objects) misbehaved when `--batch` was in use and the format did not ask for certain object traits.
See commit ee02ac6 https://github.com/git/git/commit/ee02ac616435cb1da1e02c8b9c220649d3cec40a, commit e16acc8 https://github.com/git/git/commit/e16acc80a78ae5e931b94e861aff53a4af485f77 (03 Jun 2021) by ZheNing Hu (adlternative) https://github.com/adlternative.
(Merged by Junio C Hamano -- gitster -- https://github.com/gitster in commit 5d96bcb https://github.com/git/git/commit/5d96bcbc0602f96ccee3111ff93b05389cd6eae6, 13 Jul 2021)
cat-file https://github.com/git/git/commit/e16acc80a78ae5e931b94e861aff53a4af485f77: handle trivial `--batch` format with `--batch-all-objects`
Helped-by: Jeff King
Signed-off-by: ZheNing Hu
Acked-by: Jeff King
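The `--batch` code to print an object assumes we found out the type of the object from calling `oid_object_info_extended()`.
This is true for the default format, but even in a custom format, we manually modify the `object_info` struct to ask for the type.
This assumption was broken by 845de33 https://github.com/git/git/commit/845de33a5b2c9efb7761d091f1475ec89c25798a ("cat-file: avoid noop calls to sha1_object_info_extended", 2016-05-18, Git v2.9.0-rc1 -- merge https://github.com/git/git/commit/628991391dc7fd7a07772546de8e43684439aa73).
That commit skips the call to `oid_object_info_extended()` entirely when `--batch-all-objects` is in use and the custom format does not include any placeholders that require calling it.
Or, when the custom format only includes placeholders like `%(objectname)` or `%(rest)`, `oid_object_info_extended()` will not get the type of the object.
This results in an error when we try to confirm that the type didn't change:
    $ git cat-file --batch=batman --batch-all-objects
    batman
    fatal: object 0000239 changed type!?
It also has other subtle effects (e.g., we'd fail to stream a blob, since we don't realize it's a blob in the first place).
We can fix this by flipping the order of the setup.
The check for "do we need to get the object info" must come after we've decided whether we need to look up the type.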
With Git 2.36 (Q2 2022), "git cat-file https://github.com/git/git/blob/d169d51504cd527edac82a57c6624c5d16ecb7b5/Documentation/git-cat-file.txt" (man https://git-scm.com/docs/git-cat-file) learns a `--batch-command` mode, which is a more flexible interface than the existing "`--batch`" or "`--batch-check`" modes and allows different kinds of inquiries to be made.
See commit 440c705 https://github.com/git/git/commit/440c705ea63253642c8f4761f80eb7a97a62b614, commit 4cf5d53 https://github.com/git/git/commit/4cf5d53b62a8e5fce64db97f830e00fa38bd0994, commit ac4e58c https://github.com/git/git/commit/ac4e58cab9a946d4c45f1db5ee7c79b6fb14bd67, commit a2c7552 https://github.com/git/git/commit/a2c75526d21939d5d4e36dbbd6093a5e2c14c39f (18 Feb 2022) by John Cai (john-cai) https://github.com/john-cai.
(Merged by Junio C Hamano -- gitster -- https://github.com/gitster in commit d169d51 https://github.com/git/git/commit/d169d51504cd527edac82a57c6624c5d16ecb7b5, 09 Mar 2022)
cat-file https://github.com/git/git/commit/440c705ea63253642c8f4761f80eb7a97a62b614: add `--batch-command` mode
Helped-by: Ævar Arnfjörð Bjarmason
Signed-off-by: John Cai
Add a new flag `--batch-command` that accepts commands and arguments from stdin, similar to git-update-ref https://github.com/git/git/blob/440c705ea63253642c8f4761f80eb7a97a62b614/Documentation/git-update-ref.txt (man https://git-scm.com/docs/git-update-ref) `--stdin`.
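At GitLab, we use a pair of long-running cat-file processes when accessing object content.
One is for iterating over object metadata with `--batch-check`, and the other grabs object contents with `--batch`.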
However, if we had `--batch-command`, we wouldn't need to keep both processes around, and could instead have just one `--batch-command` process where we can flip between getting object info and getting object contents.
Since we have a pair of cat-file processes per repository, this means we can get rid of roughly half of long lived git cat-file https://github.com/git/git/blob/440c705ea63253642c8f4761f80eb7a97a62b614/Documentation/git-cat-file.txt(man https://git-scm.com/docs/git-cat-file) processes.
Given there are many repositories being accessed at any given time, this can lead to huge savings.
`git cat-file --batch-command` https://github.com/git/git/blob/440c705ea63253642c8f4761f80eb7a97a62b614/Documentation/git-cat-file.txt#L99 (man https://git-scm.com/docs/git-cat-file#Documentation/git-cat-file.txt---batch-command) will enter an interactive command mode whereby the user can enter commands and their arguments, which get queued in memory:
    <command1> [arg1] [arg2] LF
    <command2> [arg1] [arg2] LF
When `--buffer` mode is used, commands are queued in memory until a flush command is issued to execute them:
    flush LF
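The reason for a flush command is that when a consumer process (A) talks to a `git cat-file` process (B) and interactively writes to and reads from it in `--buffer` mode, (A) needs to be able to control when the buffer is flushed to stdout.
Currently, from (A)'s perspective, the only way is to either:
1. kill (B)'s process, or
2. send an invalid object to stdin.
Option 1 is not ideal from a performance perspective, as it requires spawning a new cat-file process each time, and option 2 is hacky and not a good long-term solution.
With this mechanism of queueing up commands and letting (A) issue a flush command, process (A) can control when the buffer is flushed and can guarantee it will receive all of the output when in `--buffer` mode.
`--batch-command` also will not allow (B) to flush to stdout until a flush is received.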
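This patch adds the basic structure for adding commands, which can be extended in the future to add more commands.
It also adds the following two commands (on top of the flush command):
    contents <object> LF
    info <object> LF
The `contents` command takes an `<object>` argument and prints out the object contents.
The `info` command takes an `<object>` argument and prints out the object metadata.
These can be used in the following way with `--buffer`:
    info <object> LF
    contents <object> LF
    contents <object> LF
    info <object> LF
    flush LF
    info <object> LF
    flush LF
When used without `--buffer`:
    info <object> LF
    contents <object> LF
    contents <object> LF
    info <object> LF
    info <object> LF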
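The command stream described above can be assembled programmatically. The following is a minimal Python sketch under stated assumptions: `command_stream` and `run_batch_command` are hypothetical helper names, only the `info`, `contents` and `flush` commands from the commit message are used, and `run_batch_command` assumes Git 2.36 or later on the PATH.

```python
import subprocess


def command_stream(commands, flush=False):
    """Encode (command, object) pairs as LF-terminated --batch-command input."""
    out = []
    for command, obj in commands:
        if command not in ("info", "contents"):
            raise ValueError(f"unsupported command: {command}")
        out.append(f"{command} {obj}\n".encode("utf-8"))
    if flush:
        # With --buffer, queued commands produce no output until a flush.
        out.append(b"flush\n")
    return b"".join(out)


def run_batch_command(commands, repo=".", buffered=False):
    """One-shot call; a long-running consumer would keep the process open."""
    argv = ["git", "-C", repo, "cat-file", "--batch-command"]
    if buffered:
        argv.append("--buffer")
    result = subprocess.run(
        argv,
        input=command_stream(commands, flush=buffered),
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout
```

For example, `run_batch_command([("info", "HEAD"), ("contents", "HEAD")], buffered=True)` sends `info HEAD`, `contents HEAD` and `flush`, each terminated by LF.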
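`git cat-file` now includes in its man page https://github.com/git/git/blob/440c705ea63253642c8f4761f80eb7a97a62b614/Documentation/git-cat-file.txt#L99-L125:
`--batch-command`
`--batch-command=<format>`
Enter a command mode that reads commands and arguments from stdin. May only be combined with `--buffer`, `--textconv` or `--filters`. In the case of `--textconv` or `--filters`, the input lines also need to specify the path, separated by whitespace. See the section `BATCH OUTPUT` below for details.
`--batch-command` recognizes the following commands:
`contents <object>`
Print object contents for object reference `<object>`. This corresponds to the output of `--batch`.
`info <object>`
Print object info for object reference `<object>`. This corresponds to the output of `--batch-check`.
`flush`
Used with `--buffer` to execute all preceding commands that were issued since the beginning or since the last flush was issued. When `--buffer` is used, no output will come until a `flush` is issued. When `--buffer` is not used, commands are flushed each time without issuing `flush`.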
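`git cat-file` now includes in its man page https://github.com/git/git/blob/440c705ea63253642c8f4761f80eb7a97a62b614/Documentation/git-cat-file.txt#L232-L238:
When `--batch-command` is given, `cat-file` will read commands from stdin, one per line, and print information based on the command given. With `--batch-command`, the `info` command followed by an object will print information about the object the same way `--batch-check` would, and the `contents` command followed by an object prints contents in the same way `--batch` would.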
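`git cat-file` now includes in its man page https://github.com/git/git/blob/440c705ea63253642c8f4761f80eb7a97a62b614/Documentation/git-cat-file.txt#L274-L276:
If `--batch` is specified, or if `--batch-command` is used with the `contents` command, the object information is followed by the object contents (consisting of `%(objectsize)` bytes), followed by a newline.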
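Because the contents are exactly `%(objectsize)` bytes, a consumer should read by size rather than by line. Here is a minimal Python sketch of such a reader; it assumes the default output format (`<oid> <type> <size>` header lines and `<name> missing` lines), and `read_batch_records` is a hypothetical name, not a Git or library API.

```python
import io


def read_batch_records(stream):
    """Yield (name, type, contents) tuples from `git cat-file --batch` output.

    `type` and `contents` are None for "<name> missing" lines.
    """
    while True:
        header = stream.readline()
        if not header:
            return  # end of output
        header = header.rstrip(b"\n")
        if header.endswith(b" missing"):
            yield header[: -len(b" missing")].decode(), None, None
            continue
        oid, obj_type, size = header.decode().split(" ")
        contents = stream.read(int(size))  # exactly %(objectsize) bytes
        stream.read(1)  # the LF that follows the contents
        yield oid, obj_type, contents


# Synthetic output: a 10-byte blob that itself contains a newline.
sample = io.BytesIO(
    b"7ce4f05bae8120d9fa258e854a8669f6ea9cb7b1 blob 10\n12345\n6789\n"
    b"nosuch missing\n"
)
for record in read_batch_records(sample):
    print(record)
```

Reading by size is what keeps the embedded newline in the blob from being mistaken for the end of the record.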
With Git 2.38 (Q3 2022), operating modes like "`--batch`" of the "git cat-file https://github.com/git/git/blob/1e92768aa1771d5d326990855e7a582da541ef9b/Documentation/git-cat-file.txt" (man https://git-scm.com/docs/git-cat-file) command learned to take NUL-terminated input, instead of one-item-per-line.
See commit db9d67f https://github.com/git/git/commit/db9d67f2e9c9ea389f5558d6a168460d51631769, commit 3639fef https://github.com/git/git/commit/3639fefe7d1d7bf881ea6128cedc4fc503164edc (22 Jul 2022) by Taylor Blau (ttaylorr) https://github.com/ttaylorr.
(Merged by Junio C Hamano -- gitster -- https://github.com/gitster in commit 1e92768 https://github.com/git/git/commit/1e92768aa1771d5d326990855e7a582da541ef9b, 05 Aug 2022)
builtin/cat-file.c https://github.com/git/git/commit/db9d67f2e9c9ea389f5558d6a168460d51631769: support NUL-delimited input with `-z`
Signed-off-by: Taylor Blau
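When callers are using `cat-file` via one of the stdin-driven `--batch` modes, all input is newline-delimited.
This presents a problem when callers wish to ask about, e.g., tree entries that have a newline character in their filename.
To support this niche scenario, introduce a new `-z` mode to the `--batch`, `--batch-check`, and `--batch-command` suite of options that instructs `cat-file` to treat its input as NUL-delimited, allowing the individual commands themselves to contain newlines.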
The refactoring here is slightly unfortunate, since we turn loops like:
    while (strbuf_getline(&buf, stdin) != EOF)
into:
    while (1) {
        int ret;
        if (opt->nul_terminated)
            ret = strbuf_getline_nul(&input, stdin);
        else
            ret = strbuf_getline(&input, stdin);
        if (ret == EOF)
            break;
    }
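It's tempting to think that we could use `strbuf_getwholeline()` and specify either `\n` or `\0` as the terminating character.
But for input on platforms that include a CR character preceding the LF, this wouldn't quite be the same, since `strbuf_getline(...)` will trim any trailing CR, while `strbuf_getwholeline(&buf, stdin, '\n')` will not.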
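The difference can be modelled in a few lines of Python. This is only a model of the behaviour described above, not Git's C code; `getwholeline` and `getline` are hypothetical names standing in for `strbuf_getwholeline()` and `strbuf_getline()`.

```python
def getwholeline(data, term=b"\n"):
    """Model of strbuf_getwholeline(): each line keeps its terminator and any CR."""
    lines, start = [], 0
    while start < len(data):
        end = data.find(term, start)
        end = len(data) if end == -1 else end + len(term)
        lines.append(data[start:end])
        start = end
    return lines


def getline(data):
    """Model of strbuf_getline(): drop the LF and a CR directly before it."""
    result = []
    for line in getwholeline(data, b"\n"):
        if line.endswith(b"\n"):
            line = line[:-1]
            if line.endswith(b"\r"):
                line = line[:-1]
        result.append(line)
    return result


crlf_input = b"HEAD\r\nHEAD~1\n"
print(getwholeline(crlf_input))  # CR and LF are still attached
print(getline(crlf_input))       # bare object names
```

With the whole-line variant, a CRLF-terminated `HEAD` would reach the object lookup with a CR still attached, which is the mismatch the commit message warns about.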
`git cat-file` now includes in its man page https://github.com/git/git/blob/db9d67f2e9c9ea389f5558d6a168460d51631769/Documentation/git-cat-file.txt#L210-L214:
`-z`
Only meaningful with `--batch`, `--batch-check`, or `--batch-command`; input is NUL-delimited instead of newline-delimited.
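Building NUL-delimited input is straightforward. The following Python sketch is only an illustration: `nul_delimited_input` is a hypothetical helper name, and the path `odd\nname.txt` is a made-up example of a filename containing a newline.

```python
def nul_delimited_input(queries):
    """Encode queries for `git cat-file --batch -z`: one NUL after each query."""
    encoded = []
    for query in queries:
        raw = query.encode("utf-8")
        if b"\0" in raw:
            raise ValueError("a query cannot contain NUL")
        encoded.append(raw + b"\0")
    return b"".join(encoded)


# The embedded newline stays inside a single record.
stream = nul_delimited_input(["HEAD:odd\nname.txt", "HEAD"])
print(stream.split(b"\0")[:-1])
```

With newline-delimited input the first query would have been split into two separate lines.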
With Git 2.42 (Q3 2023), "git cat-file --batch https://github.com/git/git/blob/a9ea4c23dc34bd0c27166f3445aba5a8696700c7/Documentation/git-cat-file.txt#L91"(man https://git-scm.com/docs/git-cat-file#Documentation/git-cat-file.txt---batch) and friends learned -Z that uses NUL delimiter for both input and output.
See commit f79e188 https://github.com/git/git/commit/f79e18849b5dd8ad83f38744a2e08370f919491a, commit 3217f52 https://github.com/git/git/commit/3217f52a499d5264d98065374a62deb15717b727, commit af35e56 https://github.com/git/git/commit/af35e56b0f83f872a8d82d8293fae87c80b491ef, commit b116c77 https://github.com/git/git/commit/b116c77307853a9fa8f74d1ad16bed98a7211207, commit c7309f6 https://github.com/git/git/commit/c7309f63c606cebd4dea8b2a31e5ffbb2e2bf205 (06 Jun 2023) by Patrick Steinhardt (pks-t) https://github.com/pks-t.
(Merged by Junio C Hamano -- gitster -- https://github.com/gitster in commit a9ea4c2 https://github.com/git/git/commit/a9ea4c23dc34bd0c27166f3445aba5a8696700c7, 22 Jun 2023)
cat-file https://github.com/git/git/commit/f79e18849b5dd8ad83f38744a2e08370f919491a: add option '-Z' that delimits input and output with NUL
Co-authored-by: Toon Claes
Signed-off-by: Patrick Steinhardt
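In db9d67f https://github.com/git/git/commit/db9d67f2e9c9ea389f5558d6a168460d51631769 ("builtin/cat-file.c https://github.com/git/git/blob/db9d67f2e9c9ea389f5558d6a168460d51631769/builtin/cat-file.c: support NUL-delimited input with `-z`", 2022-07-22, Git v2.38.0-rc0 -- merge https://github.com/git/git/commit/1e92768aa1771d5d326990855e7a582da541ef9b listed in batch #10 https://github.com/git/git/commit/679aad9e82d0dfd8ef3d1f98fa4629665496cec9), we have introduced a new mode to read the input via NUL-delimited records instead of newline-delimited records.
This allows the user to query for revisions that have newlines in their path component.
While unusual, such queries are perfectly valid, and thus it is clear that we should be able to support them properly.
Unfortunately, the commit only changed the input to be NUL-delimited, but didn't change the output at the same time.
While this is fine for queries that are processed successfully, it is less so for queries that aren't.
In the case of missing commits, for example, the result can become entirely unparsable:
    $ printf "7ce4f05bae8120d9fa258e854a8669f6ea9cb7b1 blob 10\n1234567890\n\n\commit000" |
    git cat-file --batch -z
    7ce4f05bae8120d9fa258e854a8669f6ea9cb7b1 blob 10
    1234567890
    commit missing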
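This is of course a crafted query that intentionally games the deficiency, but more benign queries that contain newlines would have similar problems.
Ideally, we should have also changed the output to be NUL-delimited when `-z` is specified to avoid this problem.
As the input is NUL-delimited, it is clear that the output in this case cannot ever contain NUL characters by itself.
Furthermore, Git does not allow NUL characters in revisions anyway, further stressing the point that using NUL-delimited output is safe.
The only exception is of course the object data itself, but as git-cat-file(1) prints the size of the object data, clients should read until that specified size has been consumed.
But even though `-z` was only introduced a few releases ago in Git v2.38.0, changing the output format retroactively to also NUL-delimit the output would be a backwards-incompatible change.
And while one could argue that the output is inherently broken already, we need to assume that there are existing users who use it just fine, given that revisions containing newlines are quite exotic.
Instead, introduce a new option `-Z` that switches to NUL-delimited input and output.
While this new option could arguably switch only the output format to be NUL-delimited, the consequence would be that users have to always specify both `-z` and `-Z` when the input may contain newlines.
On the other hand, if the user knows that there will never be newlines in the input, they don't have to use either of those options.
There is thus no use case that would warrant treating the input and output formats separately, which is why we instead opt to "do the right thing" and have `-Z` mean to NUL-terminate both formats.
The old `-z` option is marked as deprecated, with a hint that its output may become unparsable.
It is thus hidden both from the synopsis and from the command's help output.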
`git cat-file` now includes in its man page https://github.com/git/git/blob/f79e18849b5dd8ad83f38744a2e08370f919491a/Documentation/git-cat-file.txt#L246-L250:
`-Z`
Only meaningful with `--batch`, `--batch-check`, or `--batch-command`; input and output is NUL-delimited instead of newline-delimited.
`git cat-file` now includes in its man page https://github.com/git/git/blob/f79e18849b5dd8ad83f38744a2e08370f919491a/Documentation/git-cat-file.txt#L254-L255 (for the old `-z` option):
newline-delimited. This option is deprecated in favor of `-Z` as otherwise the output may be ambiguous.
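`git cat-file` now includes in its man page https://github.com/git/git/blob/f79e18849b5dd8ad83f38744a2e08370f919491a/Documentation/git-cat-file.txt#L393-L397:
Alternatively, when `-Z` is passed, the line feeds in any of the above examples are replaced with NUL terminators. This ensures that output will be parsable if the output itself would contain a linefeed and is thus recommended for scripting purposes.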
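For scripting, this means a `-Z` consumer splits headers on NUL and still reads object contents by size. The sketch below is a Python illustration under the assumption that the default format is used and that, as the man page excerpt states, every line feed of the regular output is replaced by NUL; `read_until` and `read_nul_records` are hypothetical names.

```python
import io


def read_until(stream, delim):
    """Read bytes up to (not including) `delim`; return None at end of stream."""
    buf = bytearray()
    while True:
        byte = stream.read(1)
        if not byte:
            return bytes(buf) if buf else None
        if byte == delim:
            return bytes(buf)
        buf += byte


def read_nul_records(stream):
    """Yield (name, type, contents) from `git cat-file --batch -Z` output."""
    while True:
        header = read_until(stream, b"\0")
        if header is None:
            return
        if header.endswith(b" missing"):
            # The name may contain newlines; NUL still ends it unambiguously.
            yield header[: -len(b" missing")].decode(), None, None
            continue
        oid, obj_type, size = header.decode().split(" ")
        contents = stream.read(int(size))  # object data is read by size
        stream.read(1)  # the NUL that follows the contents
        yield oid, obj_type, contents


sample = io.BytesIO(
    b"7ce4f05bae8120d9fa258e854a8669f6ea9cb7b1 blob 10\x001234567890\x00"
    b"odd\nname missing\x00"
)
for record in read_nul_records(sample):
    print(record)
```

A missing name that contains a newline now arrives as a single record instead of looking like a header followed by data.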