Note: This answer is about Windows PowerShell (up to v5.1); PowerShell [Core, v6+], the cross-platform edition of PowerShell, now fortunately defaults to BOM-less UTF-8 on both input and output.
Windows PowerShell, unlike the underlying .NET Framework[1], uses the following defaults:
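- on input: files without a BOM (byte-order mark) are assumed to be in the system's default encoding, which is the legacy Windows code page (https://en.wikipedia.org/wiki/Windows_code_page), i.e. the "ANSI" code page: the active, culture-specific, typically single-byte encoding, as configured via Control Panel.
- on output: the > and >> redirection operators produce UTF-16 LE files by default (which do have, and need, a BOM).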
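To check which of these defaults applies in a given session, you can look at the first bytes of a file written with >. This is only a minimal sketch: the scratch file name is hypothetical, and the result depends on the PowerShell edition you run it in. A leading FF FE is the UTF-16 LE BOM, EF BB BF is the UTF-8 BOM, and neither means the file has no BOM.

```powershell
# Hypothetical scratch file; any writable path works.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'redirect-demo.txt'

# Write a short string via the redirection operator.
'hello' > $path

# Read the raw bytes via .NET (a full path is required here) and
# render the first four as hex.
$bytes = [System.IO.File]::ReadAllBytes($path)
$head  = ($bytes | Select-Object -First 4 | ForEach-Object { $_.ToString('X2') }) -join ' '
"First bytes: $head"
```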
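File-consuming and file-producing cmdlets usually do support an -Encoding parameter that lets you specify the encoding explicitly.
Prior to Windows PowerShell v5.1, using the underlying Out-File cmdlet explicitly was the only way to change the encoding.
In Windows PowerShell v5.1+, > and >> became effective aliases of Out-File, allowing you to change the encoding behavior of > and >> via the $PSDefaultParameterValues preference variable; e.g.:
$PSDefaultParameterValues['Out-File:Encoding'] = 'utf8'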
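If you only want that default for part of a script, you can set it and then restore the previous state. The following is a sketch under assumptions: the scratch file name and the variable names are made up for illustration, and Out-File is called directly rather than through >.

```powershell
# Hypothetical scratch file; any writable path works.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'default-encoding-demo.txt'
$key  = 'Out-File:Encoding'

# Remember any existing default so it can be restored afterwards.
$hadKey   = $PSDefaultParameterValues.ContainsKey($key)
$oldValue = $PSDefaultParameterValues[$key]

try {
    $PSDefaultParameterValues[$key] = 'utf8'
    # No -Encoding argument here: Out-File picks up the session default.
    # [char]0xE9 is an accented e, spelled this way to keep the script ASCII-only.
    "h$([char]0xE9)llo" | Out-File $path
}
finally {
    # Put the preference variable back the way it was.
    if ($hadKey) { $PSDefaultParameterValues[$key] = $oldValue }
    else         { $PSDefaultParameterValues.Remove($key) }
}
```

Restoring in a finally block keeps the changed default from leaking into the rest of the session if the write fails.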
For Windows PowerShell to handle UTF-8 properly, you must specify it as both the input and output encoding[2], but note that on output, PowerShell invariably adds a BOM to UTF-8 files.
Applied to your example:
Get-Content -Encoding utf8 .\utf8.txt | Out-File -Encoding utf8 out.txt
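To convince yourself that the pipeline preserves non-ASCII text, a round-trip check along these lines may help. It is a sketch only: the file names and the sample string are hypothetical, and the input file is created through .NET so that it is BOM-less UTF-8, as described in footnote [1].

```powershell
# Hypothetical scratch files; any writable paths work.
$dir     = [System.IO.Path]::GetTempPath()
$inFile  = Join-Path $dir 'utf8-in.txt'
$outFile = Join-Path $dir 'utf8-out.txt'

# Sample text with one non-ASCII character (an accented e),
# spelled via [char] to keep the script itself ASCII-only.
$text = "caf$([char]0xE9)"

# .NET writes BOM-less UTF-8 when no encoding is passed.
[System.IO.File]::WriteAllText($inFile, $text)

# Same pipeline as above: UTF-8 on input and on output.
Get-Content -Encoding utf8 $inFile | Out-File -Encoding utf8 $outFile

# Read the result back as UTF-8 and compare it with the original.
$roundTrip = Get-Content -Encoding utf8 $outFile
"Round trip preserved text: $($roundTrip -eq $text)"
```

Dropping -Encoding utf8 from the first Get-Content call is a quick way to see the input-side default described earlier.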
To create UTF-8 files without a BOM in PowerShell, see this answer of mine: https://stackoverflow.com/a/34969243/45375
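One common way to get a BOM-less UTF-8 file is to bypass the cmdlets and call .NET directly with an explicit BOM-less encoding object. This is a minimal sketch, not necessarily the approach taken in the linked answer; the file name and sample lines are hypothetical.

```powershell
# Hypothetical scratch file; .NET methods need a full path because
# .NET's working directory can differ from PowerShell's.
$path  = Join-Path ([System.IO.Path]::GetTempPath()) 'nobom-demo.txt'
$lines = 'first line', 'second line'

# UTF8Encoding constructed with $false emits no BOM.
$utf8NoBom = New-Object System.Text.UTF8Encoding $false
[System.IO.File]::WriteAllLines($path, [string[]] $lines, $utf8NoBom)

# Check whether the file starts with the UTF-8 BOM (EF BB BF).
$bytes  = [System.IO.File]::ReadAllBytes($path)
$hasBom = $bytes.Length -ge 3 -and $bytes[0] -eq 0xEF -and $bytes[1] -eq 0xBB -and $bytes[2] -eq 0xBF
"Has UTF-8 BOM: $hasBom"
```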
[1] .NET Framework uses (BOM-less) UTF-8 by default, for both input and output.
This - intentional - difference in behavior between Windows PowerShell and the framework it is built on is unusual. The difference went away in PowerShell [Core] v6+: both .NET [Core] and PowerShell [Core] default to BOM-less UTF-8.
[2] Get-Content does, however, automatically recognize UTF-8 files with a BOM.