用MATLAB和内点法实现带有时变不等式约束的分布式优化

2023-11-15

文章目录

问题描述
内点法
MATLAB实现
- 仿真结果
- 源代码

问题描述

考虑代价函数
f i ( x i , t ) = ∥ x i − x i ∗ ( t ) ∥ 2 . f_i(x_i,t)=\|x_i-x_i^*(t)\|^2. fi(xi,t)=∥xi−xi∗(t)∥2.
求解一个简单的优化问题实例：
min ⁡ 1 2 ∑ i = 1 n f i ( x i , t ) s.t. ⁡ f i ( x i , t ) ≤ 1 x i − x j = 0 \begin{aligned} \min &\quad \frac{1}{2}\sum_{i=1}^n f_i(x_i,t)\\ \operatorname{s.t.} &\quad f_i(x_i,t)\leq 1\\ &\quad x_i-x_j=0 \end{aligned} mins.t.21i=1∑nfi(xi,t)fi(xi,t)≤1xi−xj=0

如下图所示，参考位置 x i ∗ ( t ) x_i^*(t) xi∗(t)由菱形表示，初始位置 x i ( 0 ) x_i(0) xi(0)由小圆圈表示，不等式约束 f i ( x i , t ) ≤ 1 f_i(x_i,t)\leq 1 fi(xi,t)≤1由大圆圈表示。

内点法

设计barrier构建带惩罚项的代价函数
L i ( x , t ) = f i ( x , t ) − 1 ρ ( t ) log ⁡ ( 1 − ρ ( t ) ( f i ( x i , t ) − 1 ) ) , ρ ( t ) = a 1 e a 2 t , a 1 , a 2 > 0 ， L_i(x,t) = f_i(x,t)-\frac{1}{\rho(t)}\log(1-\rho(t)(f_i(x_i,t)-1)),\quad \rho(t)=a_1e^{a_2t},\quad a_1,a_2>0， Li(x,t)=fi(x,t)−ρ(t)1log(1−ρ(t)(fi(xi,t)−1)),ρ(t)=a1ea2t,a1,a2>0，
其中，梯度、Hessian和关于时间的变化率为
∇ L i ( x i , t ) = ∇ f i ( x i , t ) + ∇ f i ( x i , t ) 1 − ρ ( t ) ( f i ( x i , t ) − 1 ) ∂ ∂ t ∇ L i ( x i , t ) = ∂ ∂ t ∇ f i ( x i , t ) + ∂ ∇ f i ( x i , t ) / ∂ t 1 − ρ ( t ) ( f i ( x i , t ) − 1 ) + ( ρ ˙ ( t ) ( f i ( x i , t ) − 1 ) + ρ ( t ) ∂ f i ( x i , t ) / ∂ t ) ∇ f i ( x i , t ) ( 1 − ρ ( t ) ( f i ( x i , t ) − 1 ) ) 2 ∇ 2 L i ( x i , t ) = ∇ 2 f i ( x i , t ) + ∇ 2 f i ( x i , t ) 1 − ρ ( t ) ( f i ( x i , t ) − 1 ) + ρ ( t ) ∇ f i ( x i , t ) ∇ f i ( x i , t ) T 1 − ρ ( t ) ( f i ( x i , t ) − 1 ) \begin{aligned} \nabla L_i(x_i,t)&=\nabla f_i(x_i,t)+\frac{\nabla f_i(x_i,t)}{1-\rho(t)(f_i(x_i,t)-1)}\\ \frac{\partial}{\partial t}\nabla L_i(x_i,t)&=\frac{\partial}{\partial t}\nabla f_i(x_i,t)+\frac{\partial\nabla f_i(x_i,t)/\partial t}{1-\rho(t)(f_i(x_i,t)-1)}\\ &\quad+\frac{(\dot \rho(t)(f_i(x_i,t)-1)+\rho(t)\partial f_i(x_i,t)/\partial t)\nabla f_i(x_i,t)}{(1-\rho(t)(f_i(x_i,t)-1))^2}\\ \nabla^2 L_i(x_i,t)&=\nabla^2 f_i(x_i,t)+\frac{\nabla^2 f_i(x_i,t)}{1-\rho(t)(f_i(x_i,t)-1)}+\frac{\rho(t)\nabla f_i(x_i,t)\nabla f_i(x_i,t)^T}{1-\rho(t)(f_i(x_i,t)-1)} \end{aligned} ∇Li(xi,t)∂t∂∇Li(xi,t)∇2Li(xi,t)=∇fi(xi,t)+1−ρ(t)(fi(xi,t)−1)∇fi(xi,t)=∂t∂∇fi(xi,t)+1−ρ(t)(fi(xi,t)−1)∂∇fi(xi,t)/∂t+(1−ρ(t)(fi(xi,t)−1))2(ρ˙(t)(fi(xi,t)−1)+ρ(t)∂fi(xi,t)/∂t)∇fi(xi,t)=∇2fi(xi,t)+1−ρ(t)(fi(xi,t)−1)∇2fi(xi,t)+1−ρ(t)(fi(xi,t)−1)ρ(t)∇fi(xi,t)∇fi(xi,t)T
参考Sun 2020¹，设计如下控制律
u i = − β ( ∇ 2 L i ( x i , t ) ) − 1 ∑ j ∈ N i sgn ⁡ ( x i − x j ) + ϕ i ( t ) ϕ i ( t ) = − ( ∇ 2 L i ( x i , t ) ) − 1 ( ∇ L i ( x i , t ) + ∂ ∂ t ∇ L i ( x i , t ) ) \begin{aligned} u_i&=-\beta(\nabla^2L_i(x_i,t))^{-1}\sum_{j\in\mathcal N_i}\operatorname{sgn}(x_i-x_j)+\phi_i(t)\\ \phi_i(t)&=-(\nabla^2L_i(x_i,t))^{-1}\left(\nabla L_i(x_i,t)+\frac{\partial}{\partial t}\nabla L_i(x_i,t) \right) \end{aligned} uiϕi(t)=−β(∇2Li(xi,t))−1j∈Ni∑sgn(xi−xj)+ϕi(t)=−(∇2Li(xi,t))−1(∇Li(xi,t)+∂t∂∇Li(xi,t))

MATLAB实现

下面给出MATLAB仿真结果和源代码。

仿真结果

仿真结果动图如下所示。

每一个 x i x_i xi的轨迹如下图所示，可以看出所有agent的状态一致且收敛到优化问题的解。

源代码

robot_num = 5;

% reference
angle_central = 2*pi/robot_num;
if ~mod(robot_num,2)
    pos_ref = [cos([0 kron(1:(robot_num-1)/2,[1,-1]) robot_num/2]*angle_central)',...
        sin([0 kron(1:(robot_num-1)/2,[1,-1]) robot_num/2]*angle_central)']/2;
else
    pos_ref = [cos([0 kron(1:(robot_num-1)/2,[1,-1])]*angle_central)',...
        sin([0 kron(1:(robot_num-1)/2,[1,-1])]*angle_central)']/2;
end

% topology
graph = selectTopology(robot_num,pos_ref);

% initial position
pos_rob = pos_ref*rot2(2*pi/3);

% initial control
u_rob = zeros(size(pos_rob));

% reference control
u_ref_func = @(t) pos_ref./normby(pos_ref,1)*sin(pi*t)/2.*(-1).^(0:robot_num-1)';
u_ref = u_ref_func(0);

% plot
color_list = ['b','g','m','r','c'];
for i=1:robot_num
    hg_pos_rob(i) = plot(pos_rob(i,1),pos_rob(i,2),[color_list(i) 'o']); hold on
    pos_circ = circle_(pos_ref(i,:),1);
    hg_pos_circ(i) = plot(pos_circ(:,1),pos_circ(:,2),color_list(i));
    hg_pos_ref(i) = plot(pos_ref(i,1),pos_ref(i,2),[color_list(i) 'd']); 
    hg_u_ref(i) = quiver(pos_ref(i,1),pos_ref(i,2),u_ref(i,1),u_ref(i,2),'color',color_list(i),'LineStyle', '--');
    axis([-2 2 -2 2])
end
hold off

% functions
a1 = 1; a2 = 1; rho = a1*exp(a2*0); cost_hessian = 2*eye(2);
cost_ref = @(i,pos_rob,pos_ref) (pos_rob(i,:)-pos_ref(i,:))*(pos_rob(i,:)-pos_ref(i,:))';
cost_ref_dot = @(i,pos_rob,u_ref) -2*(pos_rob(i,:)-pos_ref(i,:))*u_ref(i,:)';
cost_ref_nabla = @(i,pos_rob,pos_ref) 2*(pos_rob(i,:)-pos_ref(i,:))';
cost_ref_nabla_dot = @(i,u_ref) -2*u_ref(i,:)';
cost_penalty = @(i,pos_rob,rho) cost_ref(i,pos_rob,pos_ref)-(1/rho)*log(1-rho*(cost_ref(i,pos_rob,pos_ref)-1));
cost_penalty_nabla = @(i,pos_rob,pos_ref,rho) cost_ref_nabla(i,pos_rob,pos_ref)*(1+1/(1-rho*(cost_ref(i,pos_rob,pos_ref)-1)));
cost_penalty_nabla_dot = @(i,pos_rob,pos_ref,rho) cost_ref_nabla_dot(i,u_ref)*(1+1/(1-rho*(cost_ref(i,pos_rob,pos_ref)-1)))+...
    rho*(a2*(cost_ref(i,pos_rob,pos_ref)-1)+cost_ref_dot(i,pos_rob,u_ref))/(1-rho*(cost_ref(i,pos_rob,pos_ref)-1))^2;
cost_penalty_hessian = @(i,pos_rob,pos_ref,rho) cost_hessian*(1+1/(1-rho*(cost_ref(i,pos_rob,pos_ref)-1)))+...
    rho*cost_ref_nabla(i,pos_rob,pos_ref)*cost_ref_nabla(i,pos_rob,pos_ref)'/(1-rho*(cost_ref(i,pos_rob,pos_ref)-1));
% test error
cost_ref(i,pos_rob,pos_ref);
cost_ref_dot(i,pos_rob,u_ref);
cost_ref_nabla(i,pos_rob,pos_ref);
cost_ref_nabla_dot(i,u_ref);
cost_penalty(i,pos_rob,rho);
cost_penalty_nabla(i,pos_rob,pos_ref,rho);
cost_penalty_nabla_dot(i,pos_rob,pos_ref,rho);
cost_penalty_hessian(i,pos_rob,pos_ref,rho);
disp('All functions are correct.');

% simulation
pos_data = pos_rob;
dt = 0.005;
T = 10;
loop = 0;
playspeed = 4;
video_on = true;
for t=0:dt:T
    loop = loop+1;
    % control
    u_ref = u_ref_func(t);
    beta = 1; hessian_inv = []; u_rob_tp = u_rob';
    for i=1:robot_num
       phi(:,i) = -cost_penalty_hessian(i,pos_rob,pos_ref,rho)^-1*...
           (cost_penalty_nabla(i,pos_rob,pos_ref,rho)+cost_penalty_nabla_dot(i,pos_rob,pos_ref,rho));
       hessian_inv = blkdiag(hessian_inv,cost_penalty_hessian(i,pos_rob,pos_ref,rho)^-1);
    end
    sign_rob = (graph.incidence*sign(graph.incidence'*pos_rob))';
    u_rob_tp(:) = -beta*hessian_inv*sign_rob(:)+phi(:);
    u_rob = u_rob_tp';
    % update
    pos_ref = pos_ref+dt*u_ref;
    pos_rob = pos_rob+dt*u_rob;
    pos_data(:,:,loop) = pos_rob;
    % plot
    for i=1:robot_num
        set(hg_pos_rob(i),'xdata',pos_rob(i,1),'ydata',pos_rob(i,2)); 
        pos_circ = circle_(pos_ref(i,:),1);
        set(hg_pos_circ(i),'xdata',pos_circ(:,1),'ydata',pos_circ(:,2));
        set(hg_pos_ref(i),'xdata',pos_ref(i,1),'ydata',pos_ref(i,2));
        set(hg_u_ref(i),'xdata',pos_ref(i,1),'ydata',pos_ref(i,2),...
            'udata',u_ref(i,1),'vdata',u_ref(i,2));
        axis([-2 2 -2 2])
    end
    % video
    if mod(loop,playspeed)==0&&video_on
        frame(loop/playspeed) = getframe(gcf);
    end
    drawnow
end

% write video
if video_on 
    savevideo('video',frame);
end

% result
figure 
t_data = 0:dt:T;
xpos_data = squeeze(pos_data(:,1,:));
ypos_data = squeeze(pos_data(:,2,:));
plot3(kron(ones(robot_num,1),t_data)',xpos_data',ypos_data');
xlabel('time/s');ylabel('x/m');zlabel('y/m');grid

源代码我已上传至我的GitHub，见项目paper-simulation。运行Sun2020Distributed文件夹下的文件time_varying_optimization.m，即可得到上面的仿真结果。代码中部分函数用到了RTB，下载和安装说明点击这里。

如果喜欢，欢迎点赞和fork。

Sun, S., & Ren, W. (2020). Distributed Continuous-Time Optimization with Time-Varying Objective Functions and Inequality Constraints. Retrieved from http://arxiv.org/abs/2009.02378 ↩︎

本文内容由网友自发贡献，版权归原作者所有，本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容，请联系:hwhale#tublm.com(使用前将#替换为@)