我想创建一个程序来爬行并检查我的网站是否有 http 错误和其他内容。
我想使用多个线程来执行此操作,这些线程应该接受要抓取的 url 等参数。
虽然我希望 X 线程处于活动状态,但仍有 Y 任务正在等待执行。
现在我想知道执行此操作的最佳策略是什么:线程池、任务、线程还是其他策略?
下面的示例展示了如何对一堆任务进行排队,但限制同时运行的数量。它使用一个Queue
跟踪准备运行的任务并使用Dictionary
跟踪正在运行的任务。当任务完成时,它会调用回调方法将其自身从任务中删除Dictionary
. An async
方法用于在空间可用时启动排队任务。
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
namespace MinimalTaskDemo
{
class Program
{
private static readonly Queue<Task> WaitingTasks = new Queue<Task>();
private static readonly Dictionary<int, Task> RunningTasks = new Dictionary<int, Task>();
public static int MaxRunningTasks = 100; // vary this to dynamically throttle launching new tasks
static void Main(string[] args)
{
var tokenSource = new CancellationTokenSource();
var token = tokenSource.Token;
Worker.Done = new Worker.DoneDelegate(WorkerDone);
for (int i = 0; i < 1000; i++) // queue some tasks
{
// task state (i) will be our key for RunningTasks
WaitingTasks.Enqueue(new Task(id => new Worker().DoWork((int)id, token), i, token));
}
LaunchTasks();
Console.ReadKey();
if (RunningTasks.Count > 0)
{
lock (WaitingTasks) WaitingTasks.Clear();
tokenSource.Cancel();
Console.ReadKey();
}
}
static async void LaunchTasks()
{
// keep checking until we're done
while ((WaitingTasks.Count > 0) || (RunningTasks.Count > 0))
{
// launch tasks when there's room
while ((WaitingTasks.Count > 0) && (RunningTasks.Count < MaxRunningTasks))
{
Task task = WaitingTasks.Dequeue();
lock (RunningTasks) RunningTasks.Add((int)task.AsyncState, task);
task.Start();
}
UpdateConsole();
await Task.Delay(300); // wait before checking again
}
UpdateConsole(); // all done
}
static void UpdateConsole()
{
Console.Write(string.Format("\rwaiting: {0,3:##0} running: {1,3:##0} ", WaitingTasks.Count, RunningTasks.Count));
}
// callback from finished worker
static void WorkerDone(int id)
{
lock (RunningTasks) RunningTasks.Remove(id);
}
}
internal class Worker
{
public delegate void DoneDelegate(int taskId);
public static DoneDelegate Done { private get; set; }
private static readonly Random Rnd = new Random();
public async void DoWork(object id, CancellationToken token)
{
for (int i = 0; i < Rnd.Next(20); i++)
{
if (token.IsCancellationRequested) break;
await Task.Delay(100); // simulate work
}
Done((int)id);
}
}
}
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系:hwhale#tublm.com(使用前将#替换为@)