A BitTorrent magnet link identifies a torrent using¹ a SHA-1 or truncated SHA-256 hash value known as the "infohash". This is the same value that peers (clients) use to identify torrents when communicating with trackers or other peers. A traditional .torrent file contains a data structure with two top-level keys: announce, identifying the tracker(s) to use for the download, and info, containing the filenames and hashes for the torrent. The "infohash" is the hash of the encoded info data.
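As a rough illustration of "the hash of the encoded info data", here is a minimal sketch of a v1 infohash calculation. It assumes the encoding is bencode (as described in BEP-3); the `bencode` helper and the sample `info` dictionary are made up for this example and are not taken from any real torrent.

```python
import hashlib


def bencode(value) -> bytes:
    """Minimal bencode encoder for ints, strings/bytes, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and are written in sorted order.
        items = [
            (key.encode("utf-8") if isinstance(key, str) else key, val)
            for key, val in value.items()
        ]
        items.sort(key=lambda pair: pair[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def v1_infohash(info: dict) -> str:
    """Hex-encoded SHA-1 of the bencoded info dictionary."""
    return hashlib.sha1(bencode(info)).hexdigest()


# Hypothetical single-file torrent, for illustration only.
example_info = {
    "name": "example.txt",
    "length": 12,
    "piece length": 16384,
    "pieces": b"\x00" * 20,  # placeholder piece hash
}
example_hash = v1_infohash(example_info)  # 40 hex characters
```

A client reading an existing .torrent file would normally hash the info bytes exactly as they appear in the file rather than re-encoding them, so that any unusual encoding in the original does not change the result.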
Some magnet links include trackers or web seeds, but they often don't. Your client may know nothing about the torrent except for its infohash. The first thing it needs to do is find other peers who are downloading the torrent. It does this using a separate peer-to-peer network² operating a "distributed hash table" (DHT). A DHT is a big distributed index which maps torrents (identified by infohashes) to lists of peers (identified by IP address and port) who are participating in a swarm for that torrent (uploading/downloading data or metadata).
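To make the "infohash plus optional trackers and web seeds" idea concrete, here is a small sketch that pulls those pieces out of a v1 magnet link. It assumes the usual parameter names (xt for the infohash, tr for trackers, ws for web seeds) and handles both hex and base 32 infohashes; the `parse_magnet` function and the example URI are illustrative only.

```python
import base64
from urllib.parse import parse_qs, urlparse


def parse_magnet(uri: str) -> dict:
    """Extract the v1 infohash, trackers and web seeds from a magnet URI."""
    parsed = urlparse(uri)
    if parsed.scheme != "magnet":
        raise ValueError("not a magnet link")
    params = parse_qs(parsed.query)

    infohash = None
    for xt in params.get("xt", []):
        if xt.startswith("urn:btih:"):
            digest = xt[len("urn:btih:"):]
            if len(digest) == 32:
                # 32 characters: base 32 form used by some old clients.
                infohash = base64.b32decode(digest.upper()).hex()
            else:
                # Otherwise assume the usual 40-character hex form.
                infohash = digest.lower()

    return {
        "infohash": infohash,
        "trackers": params.get("tr", []),   # often empty
        "web_seeds": params.get("ws", []),  # often empty
    }


# Made-up link for illustration; the hash does not refer to a real torrent.
example = parse_magnet(
    "magnet:?xt=urn:btih:000102030405060708090A0B0C0D0E0F10111213"
    "&tr=http%3A%2F%2Ftracker.example%2Fannounce"
)
```

When `trackers` and `web_seeds` both come back empty, the infohash is all the client has, which is the case the DHT lookup is meant to handle.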
The first time a client joins the DHT network it generates a random 160-bit ID from the same space as infohashes. It then bootstraps its connection to the DHT network using either hard-coded addresses of clients controlled by the client developer, or DHT-supporting clients previously encountered in a torrent swarm. When it wants to participate in a swarm for a given torrent, it searches the DHT network for several other clients whose IDs are as close³ as possible to the infohash. It notifies these clients that it would like to participate in the swarm, and asks them for the connection information of any peers they already know of who are participating in the swarm.
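The notion of "close" can be sketched in a few lines: treat node IDs and infohashes as 160-bit integers and compare them by XOR. This only illustrates the distance metric, not the network lookup itself; the function names are invented here, and the default of 8 nodes is an assumption borrowed from the bucket size used by the mainline DHT.

```python
import os
from typing import List


def random_node_id() -> bytes:
    """A random 160-bit (20-byte) ID, from the same space as infohashes."""
    return os.urandom(20)


def xor_distance(a: bytes, b: bytes) -> int:
    """XOR distance between two IDs, interpreted as big-endian integers."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def closest_nodes(target: bytes, node_ids: List[bytes], k: int = 8) -> List[bytes]:
    """Return the k known node IDs closest to the target infohash."""
    return sorted(node_ids, key=lambda node_id: xor_distance(node_id, target))[:k]


# Pick the 8 closest of 100 random node IDs to a random "infohash".
infohash = random_node_id()
known_nodes = [random_node_id() for _ in range(100)]
nearest = closest_nodes(infohash, known_nodes)
```

In a real lookup the client would query the nodes in `nearest`, learn about even closer nodes from their replies, and repeat until no closer nodes turn up.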
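As peers are uploading/downloading a particular torrent, they try to tell each other about all of the other peers they know of that are participating in the same torrent swarm. This lets peers learn about each other quickly, without subjecting a tracker or the DHT to constant requests. Once you've learned about a few peers from the DHT, your client will be able to ask those peers for the connection information of yet more peers in the torrent swarm, until you have all of the peers you need.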
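Peer lists exchanged this way are commonly sent in a "compact" form: 6 bytes per IPv4 peer, 4 for the address and 2 for the port in network byte order. The sketch below decodes such a blob and merges it into a set of known peers. The function name and sample data are hypothetical, and it assumes IPv4 entries only.

```python
import ipaddress
import struct
from typing import List, Tuple


def decode_compact_peers(blob: bytes) -> List[Tuple[str, int]]:
    """Decode compact IPv4 peer info: 4 address bytes + 2 port bytes each."""
    if len(blob) % 6 != 0:
        raise ValueError("compact peer list length must be a multiple of 6")
    peers = []
    for offset in range(0, len(blob), 6):
        # "!" = network (big-endian) byte order, no padding.
        raw_ip, port = struct.unpack_from("!4sH", blob, offset)
        peers.append((str(ipaddress.IPv4Address(raw_ip)), port))
    return peers


# Made-up payload containing two peers.
sample = bytes([192, 168, 0, 1]) + (6881).to_bytes(2, "big") \
       + bytes([10, 0, 0, 2]) + (51413).to_bytes(2, "big")

known_peers = set()
known_peers.update(decode_compact_peers(sample))  # merge newly learned peers
```

Keeping the peers in a set means that hearing about the same peer from several sources does not create duplicates.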
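Finally, we can ask these peers for the torrent's info metadata, containing the filenames and list of hashes. Once we've downloaded this information and verified that it is correct using the known infohash, we are in nearly the same position as a client that started with a regular .torrent file and got a list of peers from the included tracker.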
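The verification step might look roughly like the following. It assumes a v1 (SHA-1) infohash and that the metadata arrives in 16 KiB pieces, as in the BEP-9 metadata extension; the function name and the shape of the `pieces` argument are invented for this sketch.

```python
import hashlib

METADATA_PIECE_SIZE = 16 * 1024  # piece size assumed from BEP-9


def assemble_and_verify(pieces: dict, total_size: int, infohash_hex: str) -> bytes:
    """Join metadata pieces (index -> bytes) and check them against the infohash."""
    piece_count = -(-total_size // METADATA_PIECE_SIZE)  # ceiling division
    if set(pieces) != set(range(piece_count)):
        raise ValueError("missing or unexpected metadata pieces")

    data = b"".join(pieces[index] for index in range(piece_count))
    if len(data) != total_size:
        raise ValueError("metadata size does not match the advertised size")

    # The peers are untrusted; only the infohash from the magnet link is.
    if hashlib.sha1(data).hexdigest() != infohash_hex.lower():
        raise ValueError("metadata does not match infohash")
    return data
```

If the check fails, the client can discard the data and request the metadata again, ideally from different peers.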
The download may begin.
¹ The infohash is typically hex-encoded, but some old clients used base 32 instead. v1 (urn:btih:) uses the SHA-1 digest directly, while v2 (urn:btmh:) adds a multihash (https://github.com/multiformats/multihash) prefix to identify the hash algorithm and digest length.
² There are two primary DHT networks: the simpler "mainline" DHT, and a more complicated protocol used by Azureus.
³ The distance is measured by XOR.
Further reading
- BEP-3: The BitTorrent Protocol Specification http://www.bittorrent.org/beps/bep_0003.html
- BEP-52: The BitTorrent Protocol Specification v2 http://www.bittorrent.org/beps/bep_0052.html
- BEP-5: DHT Protocol http://www.bittorrent.org/beps/bep_0005.html
- BEP-9: Extension for Peers to Send Metadata Files http://www.bittorrent.org/beps/bep_0009.html
- BEP-10: Extension Protocol http://www.bittorrent.org/beps/bep_0010.html
- BEP-11: Peer Exchange (PEX) http://www.bittorrent.org/beps/bep_0011.html
- Azureus DHT description http://wiki.vuze.com/w/Distributed_hash_table