English Title |
Chinese Title |
|
FaceForensics++: Learning to Detect Manipulated Facial Images |
FaceForensics++:学习检测操纵的面部图像
|
|
DeepVCP: An End-to-End Deep Neural Network for Point Cloud Registration |
DeepVCP:用于点云配准的端到端深度神经网络
|
|
Shape Reconstruction Using Differentiable Projections and Deep Priors |
基于可微投影和深度先验的形状重建
|
|
Fine-Grained Segmentation Networks: Self-Supervised Segmentation for Improved Long-Term Visual Localization |
细粒度分割网络:基于自监督分割的长期视觉定位性能提升 |
|
SANet: Scene Agnostic Network for Camera Localization |
SANet:基于场景不可知网络的摄像机定位
|
|
Total Denoising: Unsupervised Learning of 3D Point Cloud Cleaning |
全消噪:三维点云清理的无监督学习
|
|
Hierarchical Self-Attention Network for Action Localization in Videos |
用于视频动作定位的分层自注意力网络
|
|
Goal-Driven Sequential Data Abstraction |
目标驱动的顺序数据抽象
|
|
Jointly Aligning Millions of Images With Deep Penalised Reconstruction Congealing |
基于深度惩罚重建凝结的数百万张图片联合对齐
|
|
Drop to Adapt: Learning Discriminative Features for Unsupervised Domain Adaptation |
Drop to Adapt:基于判别特征学习的无监督域适应
|
|
NLNL: Negative Learning for Noisy Labels |
NLNL:噪声标签的负学习
|
|
Adversarial Robustness vs. Model Compression, or Both? |
对抗鲁棒性 vs. 模型压缩,还是两者兼得? |
|
On the Design of Black-Box Adversarial Examples by Leveraging Gradient-Free Optimization and Operator Splitting Method |
利用无梯度优化和算子分裂方法设计黑盒对抗实例
|
|
DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks |
DewarpNet:使用叠加三维和二维回归网络的单图像文档去弯曲
|
|
Learning Robust Facial Landmark Detection via Hierarchical Structured Ensemble |
基于层次结构集成的鲁棒人脸Landmark检测
|
|
Remote Heart Rate Measurement From Highly Compressed Facial Videos: An End-to-End Deep Learning Solution With Video Enhancement |
从高度压缩的面部视频中进行远程心率测量:一种具有视频增强功能的端到端深度学习解决方案
|
|
Face-to-Parameter Translation for Game Character Auto-Creation |
用于游戏角色自动创建的人脸到参数转换
|
|
Visual Deprojection: Probabilistic Recovery of Collapsed Dimensions |
可视化反投影:坍塌维度的概率恢复
|
|
StructureFlow: Image Inpainting via Structure-Aware Appearance Flow |
结构流:通过结构感知的外观流进行图像修复
|
|
Learning Fixed Points in Generative Adversarial Networks: From Image-to-Image Translation to Disease Detection and Localization |
GAN中的不动点学习:从图像-图像转换到疾病检测与定位
|
|
Generative Adversarial Training for Weakly Supervised Cloud Matting |
基于生成对抗训练的弱监督云抠图
|
|
PAMTRI: Pose-Aware Multi-Task Learning for Vehicle Re-Identification Using Highly Randomized Synthetic Data |
PAMTRI:基于高度随机化合成数据的姿态感知多任务学习实现车辆再识别
|
|
Generative Adversarial Networks for Extreme Learned Image Compression |
用于极端学习图像压缩的GAN
|
|
Instance-Guided Context Rendering for Cross-Domain Person Re-Identification |
基于实例引导上下文呈现的跨域人再识别
|
|
What Else Can Fool Deep Learning? Addressing Color Constancy Errors on Deep Neural Network Performance |
还有什么可以愚弄深度学习?深度神经网络性能的色彩恒常性误差处理
|
|
Beyond Cartesian Representations for Local Descriptors |
超越笛卡尔表示的局部描述符
|
|
Distilling Knowledge From a Deep Pose Regressor Network |
从深度姿态回归网络中蒸馏知识
|
|
Instance-Level Future Motion Estimation in a Single Image Based on Ordinal Regression |
基于序数回归的单帧图像实例级未来运动估计
|
|
Vision-Infused Deep Audio Inpainting |
视觉注入深度音频修复
|
|
HAWQ: Hessian AWare Quantization of Neural Networks With Mixed-Precision |
HAWQ:利用混合精度实现神经网络的Hessian感知量化
|
|
Evaluating Robustness of Deep Image Super-Resolution Against Adversarial Attacks |
深度图像超分辨率的抗对抗攻击鲁棒性评估
|
|
Overcoming Catastrophic Forgetting With Unlabeled Data in the Wild |
利用野外无标签数据克服灾难性遗忘
|
|
Symmetric Cross Entropy for Robust Learning With Noisy Labels |
带噪声标签鲁棒学习的对称交叉熵
|
|
Few-Shot Learning With Embedded Class Models and Shot-Free Meta Training |
基于嵌入类模型和与样本数无关的元训练的小样本学习
|
|
Dual Directed Capsule Network for Very Low Resolution Image Recognition |
用于超低分辨率图像识别的双向胶囊网络
|
|
Recognizing Part Attributes With Insufficient Data |
利用不足数据识别部件属性
|
|
USIP: Unsupervised Stable Interest Point Detection From 3D Point Clouds |
USIP:三维点云的无监督稳定兴趣点检测
|
|
Mixed High-Order Attention Network for Person Re-Identification |
混合高阶注意网络用于人再识别
|
|
Budget-Aware Adapters for Multi-Domain Learning |
用于多域学习的预算感知适配器
|
|
Compact Trilinear Interaction for Visual Question Answering |
视觉问答的紧凑三线性交互
|
|
Towards Latent Attribute Discovery From Triplet Similarities |
基于三元相似性的潜在属性发现
|
|
GeoStyle: Discovering Fashion Trends and Events |
GeoStyle:发现时尚趋势和事件
|
|
Towards Adversarially Robust Object Detection |
对抗性鲁棒目标检测
|
|
Automatic and Robust Skull Registration Based on Discrete Uniformization |
基于离散均匀化的自动鲁棒颅骨配准
|
|
Few-Shot Image Recognition With Knowledge Transfer |
基于知识迁移的小样本图像识别
|
|
Fine-Grained Action Retrieval Through Multiple Parts-of-Speech Embeddings |
基于多词性嵌入的细粒度动作检索
|
|
Vehicle Re-Identification in Aerial Imagery: Dataset and Approach |
航空影像中的车辆再识别:数据集与方法
|
|
Bridging the Domain Gap for Ground-to-Aerial Image Matching |
地-空图像匹配中的域间隙桥接
|
|
A Robust Learning Approach to Domain Adaptive Object Detection |
一种鲁棒学习的域自适应目标检测方法 |
|
Graph-Based Object Classification for Neuromorphic Vision Sensing |
基于图的对象分类实现神经形态视觉感知
|
|
Gaussian YOLOv3: An Accurate and Fast Object Detector Using Localization Uncertainty for Autonomous Driving |
高斯YOLOv3:自主驾驶中一种基于定位不确定性的快速精确目标检测方法 |
|
Sharpen Focus: Learning With Attention Separability and Consistency |
集中注意力:学习中注意力的可分离性和一致性
|
|
Learning Semantic-Specific Graph Representation for Multi-Label Image Recognition |
多标签图像识别中的特定语义图表示学习
|
|
DeceptionNet: Network-Driven Domain Randomization |
DeceptionNet:网络驱动的域随机化
|
|
Pose-Guided Feature Alignment for Occluded Person Re-Identification |
基于姿态引导的特征对齐实现遮挡人再识别
|
|
Robust Person Re-Identification by Modelling Feature Uncertainty |
基于特征不确定性建模的鲁棒人再识别
|
|
Co-Segmentation Inspired Attention Networks for Video-Based Person Re-Identification |
基于共分割启发的注意网络用于基于视频的人再识别
|
|
A Delay Metric for Video Object Detection: What Average Precision Fails to Tell |
视频目标检测的一种延迟度量:平均精度不能判断什么 |
|
IL2M: Class Incremental Learning With Dual Memory |
IL2M:基于双记忆的类增量学习
|
|
Asymmetric Non-Local Neural Networks for Semantic Segmentation |
非对称非局部神经网络用于语义分割
|
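Embeds a Non-Local block in the semantic segmentation network, modifies it into an asymmetric Non-Local block, and further adds pyramid pooling and multi-level fusion (see the block diagram) |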
CCNet: Criss-Cross Attention for Semantic Segmentation |
CCNet:基于交叉注意的语义分割
|
Gathers global context information efficiently using a criss-cross pattern |
Convex Shape Prior for Multi-Object Segmentation Using a Single Level Set Function |
基于单水平集函数的凸形状先验多目标分割
|
|
Feature Weighting and Boosting for Few-Shot Segmentation |
基于特征加权和boosting的小样本分割
|
|
Surface Networks via General Covers |
基于一般覆盖的曲面网络
|
|
SSAP: Single-Shot Instance Segmentation With Affinity Pyramid |
SSAP:基于亲和度金字塔的单阶段实例分割
|
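First performs (multi-scale) semantic segmentation (S) while also obtaining (multi-scale) pixel-pair affinities (A); finally, A and S at the different scales are fused via graph cut to produce the instance segmentation |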
Learning Propagation for Arbitrarily-Structured Data |
面向任意结构数据的学习传播
|
|
MultiSeg: Semantically Meaningful, Scale-Diverse Segmentations From Minimal User Input |
MultiSeg:从最小的用户输入实现语义上有意义,尺度分散的分割
|
|
Robust Motion Segmentation From Pairwise Matches |
基于成对匹配的鲁棒运动分割
|
|
InstaBoost: Boosting Instance Segmentation via Probability Map Guided Copy-Pasting |
InstaBoost:通过概率图引导的复制粘贴增强实例分割
|
Augments the training set using a copy-paste approach |
Racial Faces in the Wild: Reducing Racial Bias by Information Maximization Adaptation Network |
自然场景下的种族人脸:利用信息最大化自适应网络减少种族偏见
|
|
Uncertainty Modeling of Contextual-Connections Between Tracklets for Unconstrained Video-Based Face Recognition |
Tracklet间上下文关系的不确定性建模实现无约束视频人脸识别
|
|
Spatio-Temporal Fusion Based Convolutional Sequence Learning for Lip Reading |
基于时空融合的卷积序列学习实现唇读
|
|
Occlusion-Aware Networks for 3D Human Pose Estimation in Video |
视频中三维人体姿态估计的遮挡感知网络
|
|
Context-Aware Feature and Label Fusion for Facial Action Unit Intensity Estimation With Partially Labeled Data |
利用部分标记数据实现基于上下文感知特征与标签融合的人脸动作单元强度估计
|
|
Distill Knowledge From NRSfM for Weakly Supervised 3D Pose Learning |
基于NRSfM的知识蒸馏实现弱监督三维姿态学习
|
|
MONET: Multiview Semi-Supervised Keypoint Detection via Epipolar Divergence |
MONET:基于极线散度的多视图半监督关键点检测
|
|
Talking With Hands 16.2M: A Large-Scale Dataset of Synchronized Body-Finger Motion and Audio for Conversational Motion Analysis and Synthesis |
Talking With Hands 16.2M:用于会话动作分析与合成的大规模身体-手指同步动作和音频数据集
|
|
Occlusion Robust Face Recognition Based on Mask Learning With Pairwise Differential Siamese Network |
基于成对差分孪生网络的掩码学习实现遮挡鲁棒人脸识别
|
|
Teacher Supervises Students How to Learn From Partially Labeled Images for Facial Landmark Detection |
教师指导学生如何从部分标记的图像中学习面部Landmark检测
|
|
A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation From a Single Depth Image |
A2J:用于单幅深度图像三维关节姿态估计的锚点到关节回归网络
|
|
TexturePose: Supervising Human Mesh Estimation With Texture Consistency |
TexturePose:基于纹理一致性监督的人体网格估计
|
|
FreiHAND: A Dataset for Markerless Capture of Hand Pose and Shape From Single RGB Images |
FreiHAND:一个从单个RGB图像中无标记捕捉手部姿势和形状的数据集
|
|
Markerless Outdoor Human Motion Capture Using Multiple Autonomous Micro Aerial Vehicles |
多自主微型飞行器无标记室外人体运动捕捉
|
|
Toyota Smarthome: Real-World Activities of Daily Living |
丰田智能家居:现实生活中的日常生活活动
|
|
Relation Parsing Neural Network for Human-Object Interaction Detection |
用于人-物交互检测的关系解析神经网络 |
|
DistInit: Learning Video Representations Without a Single Labeled Video |
DistInit:无需任何标注视频的视频表示学习
|
|
Zero-Shot Anticipation for Instructional Activities |
教学类活动的零样本预测
|
|
Making the Invisible Visible: Action Recognition Through Walls and Occlusions |
使隐形可见:通过墙和遮挡的动作识别
|
|
Recursive Visual Sound Separation Using Minus-Plus Net |
用Minus-Plus Net进行递归可视声音分离
|
|
Unsupervised Video Interpolation Using Cycle Consistency |
基于循环一致性的无监督视频插值
|
|
Deformable Surface Tracking by Graph Matching |
基于图匹配的变形曲面跟踪
|
|
Deep Meta Learning for Real-Time Target-Aware Visual Tracking |
基于深度元学习的实时目标感知视觉跟踪
|
|
Looking to Relations for Future Trajectory Forecast |
基于关系的未来轨迹预测
|
|
Anchor Diffusion for Unsupervised Video Object Segmentation |
无监督视频对象分割的锚扩散算法
|
|
Tracking Without Bells and Whistles |
不加花哨技巧的跟踪
|
|
Perspective-Guided Convolution Networks for Crowd Counting |
面向人群计数的透视导引卷积网络
|
|
End-to-End Wireframe Parsing |
端到端线框分析
|
|
Incremental Class Discovery for Semantic Segmentation With RGBD Sensing |
基于RGBD感知的增量类发现实现语义分割
|
|
SSF-DAN: Separated Semantic Feature Based Domain Adaptation Network for Semantic Segmentation |
SSF-DAN:基于分离语义特征的域自适应实现语义分割
|
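The (labeled) training samples and the unlabeled real-domain training samples lie in different domains, so domain adaptation is used to achieve weakly supervised semantic segmentation. The paper takes a GAN-based approach, as shown in Figure 2 |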
SpaceNet MVOI: A Multi-View Overhead Imagery Dataset |
SpaceNet-MVOI:一个多视图俯视图像数据集
|
|
Multi-Level Bottom-Top and Top-Bottom Feature Fusion for Crowd Counting |
用于人群计数的多层次自下而上和自上而下特征融合
|
|
Learning Lightweight Lane Detection CNNs by Self Attention Distillation |
通过自注意力蒸馏学习轻量级车道检测CNN
|
|
SplitNet: Sim2Sim and Task2Task Transfer for Embodied Visual Navigation |
SplitNet:面向具身视觉导航的Sim2Sim和Task2Task迁移
|
|
Cascaded Parallel Filtering for Memory-Efficient Image-Based Localization |
基于级联并行滤波的内存高效图像定位
|
|
Pixel2Mesh++: Multi-View 3D Mesh Generation via Deformation |
Pixel2Mesh++:通过变形生成多视图三维网格
|
|
A Differential Volumetric Approach to Multi-View Photometric Stereo |
基于差分体积法的多视光度立体成像
|
|
Revisiting Radial Distortion Absolute Pose |
重新审视径向畸变绝对姿态
|
|
Estimating the Fundamental Matrix Without Point Correspondences With Application to Transmission Imaging |
无点对应的基本矩阵估计及其在透射成像中的应用 |
|
QUARCH: A New Quasi-Affine Reconstruction Stratum From Vague Relative Camera Orientation Knowledge |
QUARCH:一种基于模糊相对摄像机方位知识的准仿射重建层
|
|
Homography From Two Orientation- and Scale-Covariant Features |
基于两个方向和尺度协变特征的单应性估计
|
|
Hiding Video in Audio via Reversible Generative Models |
基于可逆生成模型的音频中隐藏视频
|
|
GSLAM: A General SLAM Framework and Benchmark |
GSLAM:一个通用的SLAM框架和基准
|
|
Elaborate Monocular Point and Line SLAM With Robust Initialization |
具有鲁棒初始化的精细单目点-线SLAM
|
|
Adaptive Density Map Generation for Crowd Counting |
用于人群计数的自适应密度图生成
|
|
Attention-Aware Polarity Sensitive Embedding for Affective Image Retrieval |
注意力感知极性敏感嵌入在情感图像检索中的应用 |
|
Zero-Shot Emotion Recognition via Affective Structural Embedding |
基于情感结构嵌入的零样本情感识别
|
|
FW-GAN: Flow-Navigated Warping GAN for Video Virtual Try-On |
FW-GAN:用于视频虚拟试穿的流导航翘曲GAN
|
|
Interactive Sketch & Fill: Multiclass Sketch-to-Image Translation |
交互式草图与填充:多类别草图-图像转换
|
|
Attention-Based Autism Spectrum Disorder Screening With Privileged Modality |
基于注意力和特权模态的自闭症谱系障碍筛查
|
|
Image Aesthetic Assessment Based on Pairwise Comparison: A Unified Approach to Score Regression, Binary Classification, and Personalization |
基于成对比较的图像美学评价:评分回归、二元分类和个性化的统一方法
|
|
Delving Into Robust Object Detection From Unmanned Aerial Vehicles: A Deep Nuisance Disentanglement Approach |
无人机鲁棒目标检测的深入研究:一种深度干扰因素解耦方法 |
|
Bit-Flip Attack: Crushing Neural Network With Progressive Bit Search |
比特翻转攻击:利用渐进式比特搜索摧毁神经网络
|
|
Pushing the Frontiers of Unconstrained Crowd Counting: New Dataset and Benchmark Method |
推动无约束人群计数的前沿:新数据集和基准方法
|
|
Employing Deep Part-Object Relationships for Salient Object Detection |
利用深度局部-目标关系进行显著目标检测
|
|
Self-Supervised Deep Depth Denoising |
基于自监督深度学习的深度图去噪
|
|
Cost-Aware Fine-Grained Recognition for IoTs Based on Sequential Fixations |
基于序列注视的面向物联网的成本感知细粒度识别
|
|
Layout-Induced Video Representation for Recognizing Agent-in-Place Actions |
基于布局诱导的视频表示方法识别Agent原地动作
|
|
Anomaly Detection in Video Sequence With Appearance-Motion Correspondence |
基于外观运动对应的视频序列异常检测
|
|
Exploring Randomly Wired Neural Networks for Image Recognition |
随机连接神经网络在图像识别中的探索 |
|
Progressive Differentiable Architecture Search: Bridging the Depth Gap Between Search and Evaluation |
渐进可微架构搜索:缩小搜索和评估之间的深度差距
|
|
Multinomial Distribution Learning for Effective Neural Architecture Search |
基于多项式分布学习的有效的神经结构搜索
|
|
Searching for MobileNetV3 |
搜索MobileNetV3
|
|
Data-Free Quantization Through Weight Equalization and Bias Correction |
通过权值均衡和偏差校正实现无数据量化
|
|
A Camera That CNNs: Towards Embedded Neural Networks on Pixel Processor Arrays |
CNNs摄像机:面向像素处理器阵列上的嵌入式神经网络
|
|
Knowledge Distillation via Route Constrained Optimization |
基于路径约束优化的知识蒸馏
|
|
Distillation-Based Training for Multi-Exit Architectures |
基于蒸馏的训练实现多出口结构
|
|
Similarity-Preserving Knowledge Distillation |
相似性保持的知识蒸馏
|
|
Many Task Learning With Task Routing |
基于任务路由的多任务学习
|
|
Stochastic Filter Groups for Multi-Task CNNs: Learning Specialist and Generalist Convolution Kernels |
基于随机滤波器组的多任务CNN:学习专用和通用卷积核
|
|
Transferability and Hardness of Supervised Classification Tasks |
监督分类任务的可转移性与难易性
|
|
Moment Matching for Multi-Source Domain Adaptation |
基于矩匹配的多源域自适应
|
|
Unsupervised Domain Adaptation via Regularized Conditional Alignment |
基于正则条件对齐的无监督域自适应
|
|
Larger Norm More Transferable: An Adaptive Feature Norm Approach for Unsupervised Domain Adaptation |
更大范数更多可转移:一种无监督域自适应的自适应特征范数方法
|
|
UM-Adapt: Unsupervised Multi-Task Adaptation Using Adversarial Cross-Task Distillation |
UM-Adapt:使用对抗性跨任务蒸馏的无监督多任务自适应
|
|
Episodic Training for Domain Generalization |
基于幕式训练的域泛化
|
|
Domain Adaptation for Structured Output via Discriminative Patch Representations |
基于判别区分块表示的结构化输出域自适应
|
|
Semi-Supervised Learning by Augmented Distribution Alignment |
基于增广分布对齐的半监督学习
|
|
S4L: Self-Supervised Semi-Supervised Learning |
S4L:自监督半监督学习
|
|
Privacy Preserving Image Queries for Camera Localization |
隐私保护图像查询实现摄像机定位
|
|
Calibration Wizard: A Guidance System for Camera Calibration Based on Modelling Geometric and Corner Uncertainty |
标定向导:一种基于几何和角点不确定性建模的摄像机标定引导系统
|
|
Gated2Depth: Real-Time Dense Lidar From Gated Images |
Gated2Depth:来自门控图像的实时密集激光雷达
|
|
X-Section: Cross-Section Prediction for Enhanced RGB-D Fusion |
x截面:增强RGBD融合的截面预测
|
|
Learning an Event Sequence Embedding for Dense Event-Based Deep Stereo |
事件序列嵌入学习实现基于稠密事件的深度立体图
|
|
Point-Based Multi-View Stereo Network |
基于点的多视图立体网络
|
|
Discrete Laplace Operator Estimation for Dynamic 3D Reconstruction |
动态三维重建的离散Laplace算子估计
|
|
Deep Non-Rigid Structure From Motion |
深度非刚性Structure From Motion
|
|
Equivariant Multi-View Networks |
等变多视网络 |
|
Interpolated Convolutional Networks for 3D Point Cloud Understanding |
插值卷积网络在三维点云理解中的应用 |
|
Revisiting Point Cloud Classification: A New Benchmark Dataset and Classification Model on Real-World Data |
重新审视点云分类:一种基于真实数据的新的基准数据集和分类模型
|
|
PointCloud Saliency Maps |
点云显著图
|
|
ShellNet: Efficient Point Cloud Convolutional Neural Networks Using Concentric Shells Statistics |
ShellNet:基于同心壳统计的高效点云卷积神经网络
|
|
Unsupervised Deep Learning for Structured Shape Matching |
基于无监督深度学习的结构形状匹配
|
|
Linearly Converging Quasi Branch and Bound Algorithms for Global Rigid Registration |
基于线性收敛准分枝和定界算法的全局刚性配准
|
|
Consensus Maximization Tree Search Revisited |
重新审视一致性最大化树搜索
|
|
Quasi-Globally Optimal and Efficient Vanishing Point Estimation in Manhattan World |
曼哈顿世界的准全局最优高效消失点估计
|
|
An Efficient Solution to the Homography-Based Relative Pose Problem With a Common Reference Direction |
具有共同参考方向的单应相对位姿问题的有效解
|
|
A Quaternion-Based Certifiably Optimal Solution to the Wahba Problem With Outliers |
含离群点的Wahba问题的基于四元数的可证明最优解
|
|
PLMP - Point-Line Minimal Problems in Complete Multi-View Visibility |
PLMP:完全多视图可见性下的点-线最小问题
|
|
Variational Few-Shot Learning |
变分小样本学习
|
|
Generative Adversarial Minority Oversampling |
生成性对抗少数过采样
|
|
Memorizing Normality to Detect Anomaly: Memory-Augmented Deep Autoencoder for Unsupervised Anomaly Detection |
记忆正常模式以检测异常:用于无监督异常检测的记忆增强深度自动编码器
|
|
Topological Map Extraction From Overhead Images |
从俯视图像中提取拓扑图
|
|
Exploiting Temporal Consistency for Real-Time Video Depth Estimation |
利用时间一致性进行实时视频深度估计
|
|
The Sound of Motions |
运动的声音 |
|
SC-FEGAN: Face Editing Generative Adversarial Network With User's Sketch and Color |
SC-FEGAN:基于用户素描和色彩的人脸编辑生成对抗网络
|
|
Exploring Overall Contextual Information for Image Captioning in Human-Like Cognitive Style |
探索类人认知方式中图像字幕的整体语境信息
|
|
Order-Aware Generative Modeling Using the 3D-Craft Dataset |
基于3D-Craft数据集的次序感知生成建模
|
|
Crowd Counting With Deep Structured Scale Integration Network |
基于深度结构化尺度整合网络的人群计数
|
|
Bidirectional One-Shot Unsupervised Domain Mapping |
双向单样本无监督域映射
|
|
Evolving Space-Time Neural Architectures for Videos |
进化的视频时空神经结构
|
|
Universally Slimmable Networks and Improved Training Techniques |
通用可瘦身网络和改进的训练技术
|
|
AutoDispNet: Improving Disparity Estimation With AutoML |
AutoDispNet:用AutoML改进视差估计
|
A method for network architecture search and optimal hyperparameter search |
Deep Meta Functionals for Shape Representation |
基于深度元函数的形状表示
|
|
Differentiable Kernel Evolution |
可微的核演化
|
|
Batch Weight for Domain Adaptation With Mass Shift |
面向存在质量偏移的域自适应的批权重
|
|
SRM: A Style-Based Recalibration Module for Convolutional Neural Networks |
SRM:卷积神经网络中一种基于样式的再校准模块
|
|
Switchable Whitening for Deep Representation Learning |
基于可切换白化的深度表示学习
|
|
Adaptative Inference Cost With Convolutional Neural Mixture Models |
基于卷积神经混合模型的自适应推理代价
|
|
On Network Design Spaces for Visual Recognition |
基于网络设计空间的视觉识别
|
|
Improved Techniques for Training Adaptive Deep Networks |
自适应深度网络训练的改进技术
|
|
Resource Constrained Neural Network Architecture Search: Will a Submodularity Assumption Help? |
资源受限的神经网络架构搜索:子模性假设有帮助吗? |
|
ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks |
ACNet:通过非对称卷积块增强CNN的核骨架
|
|
A Comprehensive Overhaul of Feature Distillation |
特征蒸馏的全面检修
|
|
Transferable Semi-Supervised 3D Object Detection From RGB-D Data |
RGBD数据的可转移半监督三维目标检测
|
|
DPOD: 6D Pose Object Detector and Refiner |
DPOD:6D位姿目标检测器和细化器
|
|
STD: Sparse-to-Dense 3D Object Detector for Point Cloud |
STD:点云的稀疏-稠密三维目标检测器
|
|
DUP-Net: Denoiser and Upsampler Network for 3D Adversarial Point Clouds Defense |
DUP-Net:用于3D对抗点云防御的去噪和上采样网络
|
|
Learning Rich Features at High-Speed for Single-Shot Object Detection |
面向单阶段目标检测的高速丰富特征学习
|
|
Detecting Unseen Visual Relations Using Analogies |
用类比法检测未见过的视觉关系
|
|
Disentangling Monocular 3D Object Detection |
分离式单目三维目标检测
|
|
STM: SpatioTemporal and Motion Encoding for Action Recognition |
STM:用于动作识别的时空和运动编码
|
|
Dynamic Context Correspondence Network for Semantic Alignment |
语义对齐的动态上下文对应网络
|
|
Fooling Network Interpretation in Image Classification |
图像分类中的愚弄网络解释
|
|
Unconstrained Foreground Object Search |
无约束前景对象搜索
|
|
Embodied Amodal Recognition: Learning to Move to Perceive Objects |
具身非模态识别:学习通过移动来感知物体
|
|
SpatialSense: An Adversarially Crowdsourced Benchmark for Spatial Relation Recognition |
SpatialSense:一种用于空间关系识别的对抗式众包基准
|
|
TensorMask: A Foundation for Dense Object Segmentation |
TensorMask:密集目标分割的基础
|
|
Integral Object Mining via Online Attention Accumulation |
基于在线注意力积累的整体对象挖掘
|
|
Accelerated Gravitational Point Set Alignment With Altered Physical Laws |
用改变的物理定律加速引力点集对准
|
|
Domain Adaptation for Semantic Segmentation With Maximum Squares Loss |
基于最大平方损失的域自适应实现语义分割
|
Domain-adaptive semantic segmentation with two improvements: 1. a new loss function; 2. class re-weighting to address class imbalance |
Domain Randomization and Pyramid Consistency: Simulation-to-Real Generalization Without Accessing Target Domain Data |
域随机化与金字塔一致性:无需访问目标域数据的仿真到真实泛化
|
|
Semi-Supervised Skin Detection by Network With Mutual Guidance |
基于互导网络的半监督皮肤检测
|
|
ACE: Adapting to Changing Environments for Semantic Segmentation |
ACE:适应不断变化的环境实现语义分割
|
Domain-adaptive semantic segmentation |
Efficient Segmentation: Learning Downsampling Near Semantic Boundaries |
有效分割:在语义边界附近学习下采样
|
|
Recurrent U-Net for Resource-Constrained Segmentation |
基于递归U-Net的资源受限分割
|
|
Detecting the Unexpected via Image Resynthesis |
通过图像再合成检测意外
|
|
Self-Supervised Monocular Depth Hints |
自监督单目深度提示 |
|
3D Scene Reconstruction With Multi-Layer Depth and Epipolar Transformers |
基于多层深度和极线变换的三维场景重建
|
|
How Do Neural Networks See Depth in Single Images? |
神经网络如何在单个图像中看到深度? |
|
On Boosting Single-Frame 3D Human Pose Estimation via Monocular Videos |
单目视频增强单帧三维人体姿态估计
|
|
Canonical Surface Mapping via Geometric Cycle Consistency |
基于几何循环一致性的正则曲面映射
|
|
3D-RelNet: Joint Object and Relational Network for 3D Prediction |
3D-RelNet:三维预测的联合对象和关系网络
|
|
GP2C: Geometric Projection Parameter Consensus for Joint 3D Pose and Focal Length Estimation in the Wild |
GP2C:基于几何投影参数一致性的野外联合三维姿态和焦距估计
|
|
Moulding Humans: Non-Parametric 3D Human Shape Estimation From Single Images |
塑造人:基于单个图像的非参数三维人体形状估计
|
|
3DPeople: Modeling the Geometry of Dressed Humans |
3DPeople:为穿着衣服的人的几何体建模
|
|
Learning to Reconstruct 3D Human Pose and Shape via Model-Fitting in the Loop |
基于模型拟合的三维人体姿态和形状重建
|
|
Optimizing Network Structure for 3D Human Pose Estimation |
三维人体姿态估计的网络结构优化
|
|
Exploiting Spatial-Temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks |
利用时空关系的图卷积网络实现三维姿态估计
|
|
Resolving 3D Human Pose Ambiguities With 3D Scene Constraints |
利用三维场景约束解决三维人体姿态模糊问题
|
|
Tex2Shape: Detailed Full Human Body Geometry From a Single Image |
Tex2Shape:从一幅图像中获得详细的全身几何图形
|
|
PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization |
PIFu:基于像素对齐隐函数的高分辨率着装人体数字化
|
|
DF2Net: A Dense-Fine-Finer Network for Detailed 3D Face Reconstruction |
DF2Net:一种密集-精细-更精细网络实现详细三维人脸重建
|
|
Monocular 3D Human Pose Estimation by Generation and Ordinal Ranking |
基于生成和序数排序的单目三维人体姿态估计
|
|
Aligning Latent Spaces for 3D Hand Pose Estimation |
基于潜在空间对齐的三维手部姿态估计
|
|
HEMlets Pose: Learning Part-Centric Heatmap Triplets for Accurate 3D Human Pose Estimation |
HEMlets Pose:学习以局部为中心的热图三元组以精确估计三维人体姿势
|
|
End-to-End Hand Mesh Recovery From a Monocular RGB Image |
单目RGB图像的端到端手部网格恢复
|
|
Robust Multi-Modality Multi-Object Tracking |
鲁棒多模态多目标跟踪
|
|
The Trajectron: Probabilistic Multi-Agent Trajectory Modeling With Dynamic Spatiotemporal Graphs |
Trajectron:基于动态时空图的概率多智能体轨迹建模
|
|
'Skimming-Perusal' Tracking: A Framework for Real-Time and Robust Long-Term Tracking |
“略读-精读”跟踪:一个实时且鲁棒的长期跟踪框架
|
|
TASED-Net: Temporally-Aggregating Spatial Encoder-Decoder Network for Video Saliency Detection |
TASED-Net:用于视频显著性检测的时间聚合空间编解码网络
|
|
Attacking Optical Flow |
攻击光流 |
|
Pro-Cam SSfM: Projector-Camera System for Structure and Spectral Reflectance From Motion |
Pro-Cam SSfM:用于从运动恢复结构和光谱反射率的投影仪-相机系统
|
|
Mop Moire Patterns Using MopNet |
使用MopNet去除摩尔纹
|
|
Kernel Modeling Super-Resolution on Real Low-Resolution Images |
真实低分辨率图像的核模型超分辨率
|
|
Learning to Jointly Generate and Separate Reflections |
学会共同产生和分离反射
|
|
Deep Multi-Model Fusion for Single-Image Dehazing |
基于深度多模型融合的单图像去雾
|
|
Deep Learning for Seeing Through Window With Raindrops |
透过雨滴看窗外的深度学习
|
|
Mask-ShadowGAN: Learning to Remove Shadows From Unpaired Data |
Mask-ShadowGAN:学习从未配对数据中移除阴影
|
|
Spatio-Temporal Filter Adaptive Network for Video Deblurring |
用于视频去模糊的时空滤波自适应网络
|
|
Learning Deep Priors for Image Dehazing |
图像去雾的深度先验学习
|
|
JPEG Artifacts Reduction via Deep Convolutional Sparse Coding |
基于深度卷积稀疏编码的JPEG伪影抑制
|
|
Self-Guided Network for Fast Image Denoising |
用于快速图像去噪的自引导网络
|
|
Non-Local Intrinsic Decomposition With Near-Infrared Priors |
基于近红外先验的非局部本征分解
|
|
VideoMem: Constructing, Analyzing, Predicting Short-Term and Long-Term Video Memorability |
VideoMem:构建、分析、预测短期和长期视频可记忆性
|
|
Rescan: Inductive Instance Segmentation for Indoor RGBD Scans |
Rescan:面向室内RGBD扫描的归纳式实例分割
|
|
End-to-End CAD Model Retrieval and 9DoF Alignment in 3D Scans |
三维扫描中的端到端CAD模型检索与9自由度对准
|
|
Making History Matter: History-Advantage Sequence Training for Visual Dialog |
让历史发挥作用:基于历史优势序列训练的视觉对话
|
|
Stochastic Attraction-Repulsion Embedding for Large Scale Image Localization |
随机吸引-排斥嵌入在大规模图像定位中的应用 |
|
Scene Graph Prediction With Limited Labels |
基于有限标签的场景图预测
|
|
Taking a HINT: Leveraging Explanations to Make Vision and Language Models More Grounded |
提示:利用解释使视觉和语言模型更加扎根
|
|
Align2Ground: Weakly Supervised Phrase Grounding Guided by Image-Caption Alignment |
Align2Ground:图片-标注对齐引导的弱监督phrase grounding
|
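Phrase grounding: given an image and a question described in natural language, localize in the image the objects mentioned in the question. It is the … of many problems |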
Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding |
基于自适应重构网络的弱监督指代表达定位
|
|
Hierarchy Parsing for Image Captioning |
基于层次分析的图像标注
|
|
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips |
HowTo100M:通过观看一亿个叙述视频片段实现文本-视频嵌入学习
|
|
Controllable Video Captioning With POS Sequence Guidance Based on Gated Fusion Network |
基于门控融合网络的POS序列引导实现可控视频标注
|
|
Multi-View Stereo by Temporal Nonparametric Fusion |
基于时间非参数融合的多视点立体视觉
|
|
Floor-SP: Inverse CAD for Floorplans by Sequential Room-Wise Shortest Path |
Floor-SP:按顺序房间最短路径进行楼层平面逆向CAD
|
|
Polarimetric Relative Pose Estimation |
偏振相对位姿估计
|
|
Closed-Form Optimal Two-View Triangulation Based on Angular Errors |
基于角度误差的闭式最优双视图三角化
|
|
Pix2Vox: Context-Aware 3D Reconstruction From Single and Multi-View Images |
Pix2Vox:基于单视图和多视图图像的上下文感知三维重建
|
|
Unsupervised Robust Disentangling of Latent Characteristics for Image Synthesis |
潜在特征的无监督鲁棒分离实现图像合成
|
|
SROBB: Targeted Perceptual Loss for Single Image Super-Resolution |
SROBB:单图像超分辨率的目标感知损失
|
|
An Internal Learning Approach to Video Inpainting |
视频修复的内部学习方法
|
|
Deep CG2Real: Synthetic-to-Real Translation via Image Disentanglement |
Deep CG2Real:通过图像解耦实现从合成到真实的转换
|
|
Adversarial Defense via Learning to Generate Diverse Attacks |
通过学习产生多种攻击实现对抗性防御
|
|
Image Generation From Small Datasets via Batch Statistics Adaptation |
批统计自适应实现从小数据集生成图像
|
|
Lifelong GAN: Continual Learning for Conditional Image Generation |
终身GAN:条件图像生成的持续学习
|
|
Bayesian Relational Memory for Semantic Visual Navigation |
面向语义视觉导航的贝叶斯关系记忆
|
|
Mono-SF: Multi-View Geometry Meets Single-View Depth for Monocular Scene Flow Estimation of Dynamic Traffic Scenes |
Mono-SF:结合多视图几何与单视图深度的动态交通场景单目场景流估计
|
|
Prior Guided Dropout for Robust Visual Localization in Dynamic Environments |
基于先验引导Dropout的动态环境下鲁棒视觉定位
|
|
Drive&Act: A Multi-Modal Dataset for Fine-Grained Driver Behavior Recognition in Autonomous Vehicles |
Drive&Act:一个用于自主车辆细粒度驾驶员行为识别的多模态数据集
|
|
Depth Completion From Sparse LiDAR Data With Depth-Normal Constraints |
基于深度法向约束的稀疏激光雷达数据深度补全
|
|
PRECOG: PREdiction Conditioned on Goals in Visual Multi-Agent Settings |
PRECOG:视觉多Agent设置中基于目标的预测
|
|
LPD-Net: 3D Point Cloud Learning for Large-Scale Place Recognition and Environment Analysis |
LPD-Net:用于大规模地点识别和环境分析的三维点云学习
|
|
Local Supports Global: Deep Camera Relocalization With Sequence Enhancement |
局部支持全局:基于序列增强的深度相机重定位
|
|
Sequential Adversarial Learning for Self-Supervised Deep Visual Odometry |
基于序贯对抗学习的自监督深度视觉里程计
|
|
TextPlace: Visual Place Recognition and Topological Localization Through Reading Scene Texts |
TextPlace:通过阅读场景文本进行视觉位置识别和拓扑定位
|
|
CamNet: Coarse-to-Fine Retrieval for Camera Re-Localization |
CamNet:从粗到细的检索实现相机重定位
|
|
Situational Fusion of Visual Representation for Visual Navigation |
视觉表示的情景融合实现视觉导航
|
|
Learning Aberrance Repressed Correlation Filters for Real-Time UAV Tracking |
学习畸变抑制相关滤波器在无人机实时跟踪中的应用 |
|
6-DOF GraspNet: Variational Grasp Generation for Object Manipulation |
六自由度GraspNet:基于变分抓取生成的对象操作
|
|
DAGMapper: Learning to Map by Discovering Lane Topology |
DAGMapper:通过发现车道拓扑学习地图
|
|
3D-LaneNet: End-to-End 3D Multiple Lane Detection |
3D-LaneNet:端到端三维多车道检测
|
|
Sampling-Free Epistemic Uncertainty Estimation Using Approximated Variance Propagation |
基于近似方差传播的无抽样认知不确定性估计
|
|
Universal Adversarial Perturbation via Prior Driven Uncertainty Approximation |
基于先验驱动不确定性近似的通用对抗扰动
|
|
Understanding Deep Networks via Extremal Perturbations and Smooth Masks |
利用极值扰动和光滑掩模理解深度网络
|
|
Unsupervised Pre-Training of Image Features on Non-Curated Data |
未经整理的数据上图像特征的无监督预训练
|
|
Learning Local Descriptors With a CDF-Based Dynamic Soft Margin |
基于CDF的动态软间隔实现局部描述子学习
|
|
Bayes-Factor-VAE: Hierarchical Bayesian Deep Auto-Encoder Models for Factor Disentanglement |
Bayes-Factor-VAE:用于因子分离的分层Bayesian深度自编码模型
|
|
Linearized Multi-Sampling for Differentiable Image Transformation |
基于线性化多重采样的可微图像变换
|
|
AdaTransform: Adaptive Data Transformation |
AdaTransform:自适应数据转换
|
|
CARAFE: Content-Aware ReAssembly of FEatures |
CARAFE:内容感知的特征重组
|
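An improved upsampling algorithm (see Figure 2) in two steps: first, learn a kernel used for the dot product at each location (unlike bilinear interpolation, the processing at each location depends on this kernel); then use this kernel to take a weighted average over the local neighborhood, so that different locations are upsampled in different ways |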
AFD-Net: Aggregated Feature Difference Learning for Cross-Spectral Image Patch Matching |
AFD-Net:用于跨光谱图像块匹配的聚合特征差分学习
|
|
Deep Joint-Semantics Reconstructing Hashing for Large-Scale Unsupervised Cross-Modal Retrieval |
面向大规模无监督跨模态检索的深度联合语义重构哈希算法
|
|
Unsupervised Neural Quantization for Compressed-Domain Similarity Search |
基于无监督神经量化的压缩域相似性搜索
|
|
Siamese Networks: The Tale of Two Manifolds |
孪生网络:两个流形的故事 |
|
Learning Combinatorial Embedding Networks for Deep Graph Matching |
用于深度图匹配的组合嵌入网络学习
|
|
Fashion Retrieval via Graph Reasoning Networks on a Similarity Pyramid |
基于相似金字塔的图推理网络实现服装检索
|
|
Wavelet Domain Style Transfer for an Effective Perception-Distortion Tradeoff in Single Image Super-Resolution |
单图像超分辨率中基于小波域风格变换的感知失真折衷
|
|
Toward Real-World Single Image Super-Resolution: A New Benchmark and a New Model |
走向现实世界的单图像超分辨率:一种新的基准和模型
|
|
RankSRGAN: Generative Adversarial Networks With Ranker for Image Super-Resolution |
RankSRGAN:基于Ranker的GAN实现图像超分辨率
|
|
Progressive Fusion Video Super-Resolution Network via Exploiting Non-Local Spatio-Temporal Correlations |
利用非局部时空相关性的渐进式融合实现视频超分辨率网络
|
|
Deep SR-ITM: Joint Learning of Super-Resolution and Inverse Tone-Mapping for 4K UHD HDR Applications |
深度SR-ITM:4K超高清HDR应用中超分辨率和逆色调映射的联合学习
|
|
Dynamic PET Image Reconstruction Using Nonnegative Matrix Factorization Incorporated With Deep Image Prior |
非负矩阵分解结合深度图像先验的动态PET图像重建
|
|
DSIC: Deep Stereo Image Compression |
DSIC:深度立体图像压缩 |
|
Variable Rate Deep Image Compression With a Conditional Autoencoder |
基于条件自动编码器的变速率深度图像压缩
|
|
Real Image Denoising With Feature Attention |
基于特征注意的真实图像去噪
|
|
Noise Flow: Noise Modeling With Conditional Normalizing Flows |
噪声流:使用条件规范化流的噪声建模
|
|
Bottleneck Potentials in Markov Random Fields |
马尔可夫随机场的瓶颈势
|
|
Seeing Motion in the Dark |
在黑暗中看运动
|
|
SENSE: A Shared Encoder Network for Scene-Flow Estimation |
SENSE:用于场景流估计的共享编码器网络
|
|
Adversarial Feedback Loop |
对抗性反馈回路
|
|
Dynamic-Net: Tuning the Objective Without Re-Training for Synthesis Tasks |
Dynamic-Net:面向合成任务的无需重新训练的目标调节
|
|
AutoGAN: Neural Architecture Search for Generative Adversarial Networks |
AutoGAN:生成性对抗网络的神经结构搜索
|
|
Co-Evolutionary Compression for Unpaired Image Translation |
基于协同进化压缩的非成对图像翻译
|
|
Self-Supervised Representation Learning From Multi-Domain Data |
多域数据的自监督表示学习
|
|
Controlling Neural Networks via Energy Dissipation |
基于能量耗散的神经网络控制
|
|
Indices Matter: Learning to Index for Deep Image Matting |
索引的重要性:学习索引进行深度图像抠图
|
|
LAP-Net: Level-Aware Progressive Network for Image Dehazing |
LAP-Net:基于层级感知递进网络的图像去雾
|
|
Attention Augmented Convolutional Networks |
注意力增强卷积网络
|
|
MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning |
元剪枝:神经网络通道自动剪枝的元学习
|
|
Accelerate CNN via Recursive Bayesian Pruning |
通过递归贝叶斯剪枝实现加速CNN
|
|
HBONet: Harmonious Bottleneck on Two Orthogonal Dimensions |
HBONet:两个正交维度上的和谐瓶颈
|
|
O2U-Net: A Simple Noisy Label Detection Approach for Deep Neural Networks |
O2U-Net:一种简单的深度神经网络中噪声标签检测方法
|
|
Continual Learning by Asymmetric Loss Approximation With Single-Side Overestimation |
基于单侧高估的非对称损失逼近实现连续学习
|
|
Label-PEnet: Sequential Label Propagation and Enhancement Networks for Weakly Supervised Instance Segmentation |
Label-PEnet:基于序列标签传播与增强网络的弱监督实例分割
|
|
LIP: Local Importance-Based Pooling |
LIP:基于局部重要性的池化
|
|
Global Feature Guided Local Pooling |
全局特征引导的局部池化
|
|
Conditional Coupled Generative Adversarial Networks for Zero-Shot Domain Adaptation |
基于条件耦合GAN的零样本域自适应
|
|
Adversarial Defense by Restricting the Hidden Space of Deep Neural Networks |
通过限制深层神经网络隐藏空间实现对抗防御
|
|
Hyperpixel Flow: Semantic Correspondence With Multi-Layer Neural Features |
Hyperpixel Flow:基于多层神经特征的语义对应
|
|
Information Entropy Based Feature Pooling for Convolutional Neural Networks |
基于信息熵的卷积神经网络特征池化
|
|
Patchwork: A Patch-Wise Attention Network for Efficient Object Detection and Segmentation in Video Streams |
Patchwork:一种用于视频流中高效目标检测和分割的逐块注意力网络
|
|
AttentionRNN: A Structured Spatial Attention Mechanism |
AttentionRNN:一种结构化的空间注意机制
|
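RNN-like attention: when estimating the attention mask, each location depends on the locations already estimated (the conventional approach estimates each location independently) |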
Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks With Octave Convolution |
降八度:用八度卷积减少卷积神经网络的空间冗余
|
|
Domain Intersection and Domain Difference |
域交与域差
|
|
Learned Video Compression |
学习视频压缩
|
|
Local Relation Networks for Image Recognition |
基于局部关系网络的图像识别
|
|
DiscoNet: Shapes Learning on Disconnected Manifolds for 3D Editing |
DiscoNet:基于非连通流形形状学习的三维编辑
|
|
Deep Residual Learning in the JPEG Transform Domain |
JPEG变换域的深度残差学习
|
|
Approximated Bilinear Modules for Temporal Modeling |
基于近似双线性模块的时域建模
|
|
Customizing Student Networks From Heterogeneous Teachers via Adaptive Knowledge Amalgamation |
自适应知识融合实现从异构教师网络定制学生网络
|
|
Data-Free Learning of Student Networks |
学生网络的无数据学习
|
|
Deep Closest Point: Learning Representations for Point Cloud Registration |
深度最近点:基于表示学习的点云配准
|
|
Orientation-Aware Semantic Segmentation on Icosahedron Spheres |
二十面体球面上的方向感知语义分割
|
Semantic segmentation of omnidirectional images |
Differentiable Learning-to-Group Channels via Groupable Convolutional Neural Networks |
基于可分组卷积神经网络的可微通道分组学习
|
|
HarDNet: A Low Memory Traffic Network |
HarDNet:一种低内存流量网络
|
|
Dynamic Multi-Scale Filters for Semantic Segmentation |
用于语义分割的动态多尺度滤波器
|
As shown in Fig. 2 of the paper, several filters learned via adaptive pooling are added to the network |
Online Model Distillation for Efficient Video Inference |
基于在线模型蒸馏的高效视频推理
|
|
Rethinking Zero-Shot Learning: A Conditional Visual Classification Perspective |
从条件视觉分类的角度反思零样本学习
|
|
Task-Driven Modular Networks for Zero-Shot Compositional Learning |
基于任务驱动模块化网络的零样本组合学习
|
|
Transductive Episodic-Wise Adaptive Metric for Few-Shot Learning |
基于直推式逐情景自适应度量的小样本学习
|
|
Deep Multiple-Attribute-Perceived Network for Real-World Texture Recognition |
用于真实纹理识别的深度多属性感知网络
|
|
RGB-Infrared Cross-Modality Person Re-Identification via Joint Pixel and Feature Alignment |
基于联合像素和特征对齐的RGB-红外交叉模态人再识别
|
|
EvalNorm: Estimating Batch Normalization Statistics for Evaluation |
EvalNorm:估计用于评估的批处理规范化(BN)统计信息
|
|
Beyond Human Parts: Dual Part-Aligned Representations for Person Re-Identification |
超越人的部分:基于双部分对齐表示的人再识别
|
|
Person Search by Text Attribute Query As Zero-Shot Learning |
将文本属性查询视为零样本学习的行人搜索
|
|
Semantic-Aware Knowledge Preservation for Zero-Shot Sketch-Based Image Retrieval |
面向零样本草图图像检索的语义感知知识保持
|
|
Active Learning for Deep Detection Neural Networks |
面向深度检测神经网络的主动学习
|
|
One-Shot Neural Architecture Search via Self-Evaluated Template Network |
基于自评估模板网络的一次性神经网络结构搜索
|
|
Batch DropBlock Network for Person Re-Identification and Beyond |
用于人再识别及其他的批处理DropBlock网络
|
|
Omni-Scale Feature Learning for Person Re-Identification |
全尺度特征学习用于人再识别
|
|
Be Your Own Teacher: Improve the Performance of Convolutional Neural Networks via Self Distillation |
做自己的老师:通过自蒸馏提高卷积神经网络的性能
|
|
Diversity With Cooperation: Ensemble Methods for Few-Shot Classification |
合作中的多样性:用于小样本分类的集成方法 |
|
Enhancing 2D Representation via Adjacent Views for 3D Shape Retrieval |
基于邻接视图的二维表示增强实现三维形状检索
|
|
Adversarial Fine-Grained Composition Learning for Unseen Attribute-Object Recognition |
对抗性细粒度合成学习在未见属性-对象识别中的应用 |
|
Auto-ReID: Searching for a Part-Aware ConvNet for Person Re-Identification |
Auto-ReID:搜索局部感知ConvNet实现人重识别
|
|
Second-Order Non-Local Attention Networks for Person Re-Identification |
二阶非局部注意网络用于人再识别
|
|
Fast Computation of Content-Sensitive Superpixels and Supervoxels Using Q-Distances |
用Q-距离快速计算内容敏感超像素和超体素
|
|
Progressive-X: Efficient, Anytime, Multi-Model Fitting Algorithm |
Progressive-X:高效、随时、多模型拟合算法
|
|
Structured Modeling of Joint Deep Feature and Prediction Refinement for Salient Object Detection |
联合深度特征和预测细化的结构化建模实现显著目标检测
|
|
Selectivity or Invariance: Boundary-Aware Salient Object Detection |
选择性或不变性:边界感知显著目标检测
|
|
Online Unsupervised Learning of the 3D Kinematic Structure of Arbitrary Rigid Bodies |
任意刚体三维运动学结构的在线无监督学习
|
|
Few-Shot Generalization for Single-Image 3D Reconstruction via Priors |
基于先验的单幅图像三维重建小样本泛化
|
|
Digging Into Self-Supervised Monocular Depth Estimation |
自监督单目深度估计方法的研究 |
|
Learning Object-Specific Distance From a Monocular Image |
从单目图像中学习特定对象的距离
|
|
Unsupervised 3D Reconstruction Networks |
无监督三维重建网络 |
|
3D Point Cloud Generative Adversarial Network Based on Tree Structured Graph Convolutions |
基于树结构图卷积的三维点云GAN
|
|
Visualization of Convolutional Neural Networks for Monocular Depth Estimation |
卷积神经网络可视化在单目深度估计中的应用 |
|
Co-Separating Sounds of Visual Objects |
视觉对象声音的协同分离
|
|
BMN: Boundary-Matching Network for Temporal Action Proposal Generation |
BMN:用于时序动作提名生成的边界匹配网络
|
|
Weakly Supervised Temporal Action Localization Through Contrast Based Evaluation Networks |
基于对比度评价网络的弱监督时间行为定位
|
|
Progressive Sparse Local Attention for Video Object Detection |
基于渐进稀疏局部注意的视频目标检测
|
|
Reasoning About Human-Object Interactions Through Dual Attention Networks |
基于双注意网络的人-物交互推理
|
|
DMM-Net: Differentiable Mask-Matching Network for Video Object Segmentation |
DMM-Net:用于视频对象分割的可微掩模匹配网络
|
|
Asymmetric Cross-Guided Attention Network for Actor and Action Video Segmentation From Natural Language Query |
非对称交叉引导注意网络实现自然语言查询中角色和动作视频分割
|
|
AGSS-VOS: Attention Guided Single-Shot Video Object Segmentation |
AGSS-VOS:注意力引导的单次视频对象分割
|
|
Global-Local Temporal Representations for Video Person Re-Identification |
基于全局-局部时间表示的视频人再识别
|
|
AdvIT: Adversarial Frames Identifier Based on Temporal Consistency in Videos |
AdvIT:基于视频时间一致性的对抗帧识别器
|
|
RANet: Ranking Attention Network for Fast Video Object Segmentation |
RANet:用于视频对象快速分割的排序注意网络
|
|
Spatial-Temporal Relation Networks for Multi-Object Tracking |
用于多目标跟踪的时空关系网络
|
|
Bridging the Gap Between Detection and Tracking: A Unified Approach |
缩小检测和跟踪之间的差距:一种统一的方法
|
|
Learning the Model Update for Siamese Trackers |
学习孪生跟踪器的模型更新
|
|
Fast-deepKCF Without Boundary Effect |
无边界效应的快速深度KCF
|
|
Program-Guided Image Manipulators |
程序引导图像操纵器
|
|
Calibration of Axial Fisheye Cameras Through Generic Virtual Central Models |
基于通用虚拟中心模型的轴向鱼眼相机标定
|
|
Micro-Baseline Structured Light |
微基线结构光
|
|
λ-Net: Reconstruct Hyperspectral Images From a Snapshot Measurement |
λ-Net:从快照测量重建高光谱图像
|
|
Deep Depth From Aberration Map |
基于像差图的深度估计
|
|
A Dataset of Multi-Illumination Images in the Wild |
野外多光照图像数据集
|
|
Monocular Neural Image Based Rendering With Continuous View Control |
具有连续视角控制的单目神经图像渲染
|
|
Multi-View Image Fusion |
多视点图像融合
|
|
Enhancing Low Light Videos by Exploring High Sensitivity Camera Noise |
利用高灵敏度相机噪声实现微光视频增强
|
|
Deep Restoration of Vintage Photographs From Scanned Halftone Prints |
从扫描的半色调照片中深度复原复古照片
|
|
Context-Aware Image Matting for Simultaneous Foreground and Alpha Estimation |
上下文感知图像抠图实现同时进行前景和α估计
|
|
CFSNet: Toward a Controllable Feature Space for Image Restoration |
CFSNet:基于可控特征空间的图像复原
|
|
Deep Blind Hyperspectral Image Fusion |
深度盲高光谱图像融合
|
|
Fully Convolutional Pixel Adaptive Image Denoiser |
全卷积像素自适应图像去噪
|
|
Coherent Semantic Attention for Image Inpainting |
基于连贯语义注意的图像修补
|
|
Embedded Block Residual Network: A Recursive Restoration Model for Single-Image Super-Resolution |
嵌入块残差网络:一种单图像超分辨率的递归恢复模型
|
|
Fast Image Restoration With Multi-Bin Trainable Linear Units |
基于Multi-Bin可训练线性单元的快速图像复原
|
|
Counting With Focus for Free |
无额外代价的聚焦计数
|
|
SynDeMo: Synergistic Deep Feature Alignment for Joint Learning of Depth and Ego-Motion |
SynDeMo:基于协同深度特征对齐的深度和自我运动联合学习
|
|
Diverse Image Synthesis From Semantic Layouts via Conditional IMLE |
基于条件IMLE的语义布局多样性图像合成
|
|
Towards Bridging Semantic Gap to Improve Semantic Segmentation |
通过桥接语义鸿沟实现语义分割改进
|
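The paper addresses the fusion of features at different scales. The network in Fig. 6 uses the three modules of Fig. 4 to improve semantic segmentation along two directions: multi-scale fusion and edge awareness |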
Generating Diverse and Descriptive Image Captions Using Visual Paraphrases |
使用视觉释义生成多样的描述性图片标注
|
|
Learning to Collocate Neural Modules for Image Captioning |
基于神经模块配置学习的图像标注
|
|
Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning |
序列潜空间在多样图像标注中的意图建模
|
|
Why Does a Visual Question Have Different Answers? |
为什么视觉问题有不同的答案? |
|
G3raphGround: Graph-Based Language Grounding |
G3raphGround:基于图形的语言Grounding
|
|
Scene Text Visual Question Answering |
场景文本视觉问答
|
|
Unsupervised Collaborative Learning of Keyframe Detection and Visual Odometry Towards Monocular Deep SLAM |
关键帧检测与视觉里程测量的无监督协同学习实现单目深度SLAM
|
|
MVSCRF: Learning Multi-View Stereo With Conditional Random Fields |
MVSCRF:基于条件随机场的多视图立体学习
|
|
Neural-Guided RANSAC: Learning Where to Sample Model Hypotheses |
神经引导的RANSAC:模型假设采样位置学习
|
|
Efficient Learning on Point Clouds With Basis Point Sets |
基于基础点集的点云高效学习
|
|
Cross View Fusion for 3D Human Pose Estimation |
基于交叉视图融合的三维人体姿态估计
|
|
Shape-Aware Human Pose and Shape Reconstruction Using Multi-View Images |
基于多视点图像的形状感知人体姿态与形状重建
|
|
Monocular Piecewise Depth Estimation in Dynamic Scenes by Exploiting Superpixel Relations |
基于超像素关系的动态场景单目分段深度估计
|
|
Is This the Right Place? Geometric-Semantic Pose Verification for Indoor Visual Localization |
这是对的地方吗?基于几何-语义位姿验证的室内视觉定位
|
|
DeepPruner: Learning Efficient Stereo Matching via Differentiable PatchMatch |
DeepPruner:通过可微PatchMatch实现高效的立体匹配学习
|
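1. Formulates PatchMatch with an RNN structure. 2. Uses the differentiable PatchMatch to narrow the disparity search range for each pixel (conventional methods consider every possible disparity, whereas here each pixel considers only a subset, the confidence range, roughly 1/10 of the full disparity range) |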
Convolutional Sequence Generation for Skeleton-Based Action Synthesis |
利用卷积序列生成实现基于骨架的动作合成
|
|
Onion-Peel Networks for Deep Video Completion |
Onion-Peel网络用于深度视频补全
|
|
Copy-and-Paste Networks for Deep Video Inpainting |
基于复制-粘贴网络的深度视频修补
|
|
Content and Style Disentanglement for Artistic Style Transfer |
基于内容与风格解构的艺术风格转换
|
|
Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space? |
Image2StyleGAN:如何将图像嵌入StyleGAN潜在空间? |
|
Controllable Artistic Text Style Transfer via Shape-Matching GAN |
基于形状-匹配GAN的可控艺术文本风格转换
|
|
Understanding Generalized Whitening and Coloring Transform for Universal Style Transfer |
广义白化与着色变换在通用风格转换中的应用 |
|
Learning Implicit Generative Models by Matching Perceptual Features |
基于感知特征匹配的隐生成模型学习
|
|
Free-Form Image Inpainting With Gated Convolution |
基于门控卷积的自由形式图像补全
|
|
FiNet: Compatible and Diverse Fashion Image Inpainting |
FiNet:兼容且多样的时尚图像修复
|
|
InGAN: Capturing and Retargeting the "DNA" of a Natural Image |
InGAN:捕捉并重定向自然图像的“DNA”
|
|
Seeing What a GAN Cannot Generate |
看一个GAN不能产生什么 |
|
COCO-GAN: Generation by Parts via Conditional Coordinating |
COCO-GAN:基于条件配位的分块生成
|
|
Neural Turtle Graphics for Modeling City Road Layouts |
用于城市道路布局建模的神经海龟绘图
|
|
Texture Fields: Learning Texture Representations in Function Space |
纹理场:在函数空间中学习纹理表示
|
|
PointFlow: 3D Point Cloud Generation With Continuous Normalizing Flows |
PointFlow:基于连续规格化流的三维点云生成
|
|
Meta-Sim: Learning to Generate Synthetic Datasets |
Meta-Sim:学习生成合成数据集
|
|
Specifying Object Attributes and Relations in Interactive Scene Generation |
在交互式场景生成中指定对象属性和关系
|
|
SinGAN: Learning a Generative Model From a Single Natural Image |
SinGAN:从单一自然图像学习生成模型
|
|
VaTeX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research |
VaTeX:一个用于视频和语言研究的大规模、高质量的多语言数据集
|
|
A Graph-Based Framework to Bridge Movies and Synopses |
一种基于图的框架实现电影与剧情的桥接
|
|
From Strings to Things: Knowledge-Enabled VQA Model That Can Read and Reason |
从字符串到事物:可以读取和推理的支持知识的VQA模型
|
|
Counterfactual Critic Multi-Agent Training for Scene Graph Generation |
用于场景图生成的反事实批评家多智能体训练
|
|
Robust Change Captioning |
鲁棒的变化描述
|
|
Attention on Attention for Image Captioning |
用于图像标注的Attention on Attention
|
|
Dynamic Graph Attention for Referring Expression Comprehension |
动态图注意力在指称表达理解中的应用 |
|
Visual Semantic Reasoning for Image-Text Matching |
基于视觉语义推理的图-文匹配
|
|
Phrase Localization Without Paired Training Examples |
无配对训练实例的短语定位
|
|
Learning to Assemble Neural Module Tree Networks for Visual Grounding |
基于神经模块树网络学习的视觉Grounding
|
|
A Fast and Accurate One-Stage Approach to Visual Grounding |
一种快速准确的单阶段视觉Grounding方法 |
|
Zero-Shot Grounding of Objects From Natural Language Queries |
基于自然语言查询的对象零样本Grounding
|
|
Towards Unconstrained End-to-End Text Spotting |
朝向无约束的端到端文本定位
|
|
What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis |
场景文本识别模型比较有什么问题?数据集与模型分析
|
|
Sparse and Imperceivable Adversarial Attacks |
稀疏且难以察觉的对抗性攻击
|
|
Enhancing Adversarial Example Transferability With an Intermediate Level Attack |
利用中间层攻击增强对抗样本的可迁移性
|
|
Implicit Surface Representations As Layers in Neural Networks |
神经网络层中的隐式曲面表示
|
|
A Tour of Convolutional Networks Guided by Linear Interpreters |
线性解释引导的卷积网络之旅 |
|
Small Steps and Giant Leaps: Minimal Newton Solvers for Deep Learning |
小步与飞跃:面向深度学习的极简牛顿求解器
|
|
Semantic Adversarial Attacks: Parametric Transformations That Fool Deep Classifiers |
语义对抗攻击:通过参数转换愚弄深度分类器
|
|
Hilbert-Based Generative Defense for Adversarial Examples |
针对对抗样本的基于希尔伯特的生成式防御
|
|
On the Efficacy of Knowledge Distillation |
论知识蒸馏的功效
|
|
Sym-Parameterized Dynamic Inference for Mixed-Domain Image Translation |
混合域图像翻译的Sym参数化动态推理
|
|
Better and Faster: Exponential Loss for Image Patch Matching |
更快更好:图像块匹配的指数损失
|
|
Physical Adversarial Textures That Fool Visual Object Tracking |
物理对抗纹理欺骗视觉对象跟踪
|
|
Wasserstein GAN With Quadratic Transport Cost |
基于二次传输代价的Wasserstein GAN
|
|
Scalable Verified Training for Provably Robust Image Classification |
基于可扩展验证训练的可证明鲁棒图像分类
|
|
Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks |
可微软量化:全精度与低比特神经网络的桥接
|
|
The LogBarrier Adversarial Attack: Making Effective Use of Decision Boundary Information |
LogBarrier对抗攻击:决策边界信息的有效利用 |
|
Proximal Mean-Field for Neural Network Quantization |
基于近端平均场的神经网络量化
|
|
Improving Adversarial Robustness via Guided Complement Entropy |
利用引导互补熵提高对抗稳健性
|
|
A Geometry-Inspired Decision-Based Attack |
基于几何启发的决策攻击
|
|
Universal Perturbation Attack Against Image Retrieval |
图像检索中的普遍扰动攻击
|
|
Bayesian Optimized 1-Bit CNNs |
贝叶斯优化的1-Bit CNNs
|
|
Rethinking ImageNet Pre-Training |
对ImageNet预训练的再思考
|
|
Defending Against Universal Perturbations With Shared Adversarial Training |
基于共享对抗训练的通用扰动防御
|
|
Adaptive Activation Thresholding: Dynamic Routing Type Behavior for Interpretability in Convolutional Neural Networks |
自适应激活阈值:基于动态路由类型行为的卷积神经网络可解释性
|
|
XRAI: Better Attributions Through Regions |
XRAI:通过区域获得更好的归因
|
|
Guessing Smart: Biased Sampling for Efficient Black-Box Adversarial Attacks |
聪明地猜测:基于有偏采样的高效黑盒对抗攻击
|
|
Mask-Guided Attention Network for Occluded Pedestrian Detection |
基于掩码引导注意网络的遮挡行人检测
|
|
Spectral Feature Transformation for Person Re-Identification |
基于谱特征变换的人再识别
|
|
Permutation-Invariant Feature Restructuring for Correlation-Aware Image Set-Based Recognition |
置换不变特征重构实现基于相关感知图像集的图像识别
|
|
Improving Pedestrian Attribute Recognition With Weakly-Supervised Multi-Scale Attribute-Specific Localization |
基于弱监督多尺度属性特定定位的行人属性识别
|
|
Correlation Congruence for Knowledge Distillation |
基于相关性一致的知识蒸馏
|
|
Dynamic Curriculum Learning for Imbalanced Data Classification |
基于动态课程学习的不平衡数据分类
|
|
Video Face Clustering With Unknown Number of Clusters |
未知簇数的视频人脸聚类
|
|
Targeted Mismatch Adversarial Attack: Query With a Flower to Retrieve the Tower |
目标不匹配对抗攻击:用花查询以检索塔
|
|
Fashion++: Minimal Edits for Outfit Improvement |
Fashion++:以最小编辑实现服装改进
|
|
Semi-Supervised Pedestrian Instance Synthesis and Detection With Mutual Reinforcement |
基于互增强的半监督行人实例合成与检测
|
|
SILCO: Show a Few Images, Localize the Common Object |
SILCO:显示一些图像,定位公共对象
|
|
A Deep Step Pattern Representation for Multimodal Retinal Image Registration |
多模视网膜图像配准的深度阶跃模式表示
|
|
Deep Graphical Feature Learning for the Feature Matching Problem |
深度图特征学习解决特征匹配问题 |
|
Minimum Delay Object Detection From Video |
视频的最小延迟目标检测
|
|
Learning With Average Precision: Training Image Retrieval With a Listwise Loss |
平均精度学习:基于列表损失的图像检索训练
|
|
Learning to Find Common Objects Across Few Image Collections |
学习在少数图像集合中查找公共对象
|
|
Weakly Aligned Cross-Modal Learning for Multispectral Pedestrian Detection |
基于弱对齐跨模态学习的多光谱行人检测
|
|
Deep Self-Learning From Noisy Labels |
基于噪声标签的深度自学习
|
|
DSConv: Efficient Convolution Operator |
DSConv:高效卷积算子
|
|
Once a MAN: Towards Multi-Target Attack via Learning Multi-Target Adversarial Network Once |
Once a MAN:通过一次学习多目标对抗网络实现多目标攻击
|
|
Explicit Shape Encoding for Real-Time Instance Segmentation |
基于显式形状编码的实时实例分割
|
|
IMP: Instance Mask Projection for High Accuracy Semantic Segmentation of Things |
IMP:用于高精度语义分割的实例掩码投影
|
|
Video Instance Segmentation |
视频实例分割 |
|
Attention Bridging Network for Knowledge Transfer |
基于注意力桥接网络的知识转移
|
|
Self-Supervised Difference Detection for Weakly-Supervised Semantic Segmentation |
基于自监督差分检测的弱监督语义分割
|
|
SPGNet: Semantic Prediction Guidance for Scene Parsing |
SPGNet:基于语义预测指导的场景分析
|
|
Gated-SCNN: Gated Shape CNNs for Semantic Segmentation |
门控SCNN:用于语义分割的门控形状CNN
|
|
DensePoint: Learning Densely Contextual Representation for Efficient Point Cloud Processing |
DensePoint:基于密集上下文表示学习的高效点云处理
|
|
AMP: Adaptive Masked Proxies for Few-Shot Segmentation |
AMP:基于自适应掩蔽代理的小样本分割
|
|
Universal Semi-Supervised Semantic Segmentation |
通用半监督语义分割
|
|
Accelerate Learning of Deep Hashing With Gradient Attention |
利用梯度注意力加速深度散列学习
|
|
SVD: A Large-Scale Short Video Dataset for Near-Duplicate Video Retrieval |
SVD:一种用于近重复视频检索的大规模短视频数据集
|
|
Block Annotation: Better Image Annotation With Sub-Image Decomposition |
块注释:使用子图像分解更好的图像注释
|
|
Probabilistic Deep Ordinal Regression Based on Gaussian Processes |
基于高斯过程的概率深度序数回归
|
|
Balanced Datasets Are Not Enough: Estimating and Mitigating Gender Bias in Deep Image Representations |
平衡的数据集是不够的:估计和减轻深度图像表示中的性别偏见
|
|
Teacher Guided Architecture Search |
教师指导的架构搜索
|
|
FACSIMILE: Fast and Accurate Scans From an Image in Less Than a Second |
FACSIMILE:在不到一秒内由单幅图像获得快速而精确的扫描
|
|
Delving Deep Into Hybrid Annotations for 3D Human Recovery in the Wild |
深入研究混合标注在野外三维人体复原中的应用 |
|
Human Mesh Recovery From Monocular Images via a Skeleton-Disentangled Representation |
基于骨架分离表示的单目图像人体网格恢复
|
|
Three-D Safari: Learning to Estimate Zebra Pose, Shape, and Texture From Images "In the Wild" |
三维漫游:学习从“野外”图像中估计斑马的姿势、形状和纹理
|
|
Object-Driven Multi-Layer Scene Decomposition From a Single Image |
基于单个图像的对象驱动多层场景分解
|
|
Occupancy Flow: 4D Reconstruction by Learning Particle Dynamics |
占用流:基于粒子动力学学习的4D重建
|
|
Joint Monocular 3D Vehicle Detection and Tracking |
单目三维车辆联合检测与跟踪
|
|
Fingerspelling Recognition in the Wild With Iterative Visual Attention |
基于迭代视觉注意力的野外手指拼写识别
|
|
PointAE: Point Auto-Encoder for 3D Statistical Shape and Texture Modelling |
PointAE:用于三维统计形状和纹理建模的点自动编码器
|
|
Multi-Garment Net: Learning to Dress 3D People From Images |
多服装网:从图像中学习三维人体着装
|
|
Skeleton-Aware 3D Human Shape Reconstruction From Point Clouds |
基于点云的骨骼感知三维人体形状重建
|
|
AMASS: Archive of Motion Capture As Surface Shapes |
AMASS:作为表面形状的运动捕捉存档
|
|
Person-in-WiFi: Fine-Grained Person Perception Using WiFi |
WIFI中的人:使用WIFI的细粒度的人感知
|
|
FAB: A Robust Facial Landmark Detection Framework for Motion-Blurred Videos |
FAB:一种鲁棒的运动模糊视频人脸关键点检测框架 |
|
Attentional Feature-Pair Relation Networks for Accurate Face Recognition |
基于注意力特征对关系网络的精确人脸识别
|
|
Action Recognition With Spatial-Temporal Discriminative Filter Banks |
基于时空判别滤波器组的动作识别
|
|
EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition |
EPIC-Fusion:用于自我中心行为识别的视听时间绑定
|
|
Weakly-Supervised Action Localization With Background Modeling |
基于背景建模的弱监督动作定位
|
|
Grouped Spatial-Temporal Aggregation for Efficient Action Recognition |
基于分组时空聚合的高效动作识别
|
|
Temporal Structure Mining for Weakly Supervised Action Detection |
弱监督动作检测的时间结构挖掘
|
|
Temporal Recurrent Networks for Online Action Detection |
用于在线动作检测的时间递归网络
|
|
StartNet: Online Detection of Action Start in Untrimmed Videos |
StartNet:未剪辑视频中动作开始的在线检测
|
|
Video Classification With Channel-Separated Convolutional Networks |
基于通道分离卷积网络的视频分类
|
|
Predicting the Future: A Jointly Learnt Model for Action Anticipation |
预测未来:一个基于共同学习的行动预测模型 |
|
Human-Aware Motion Deblurring |
人体感知运动去模糊
|
|
Fast Video Object Segmentation via Dynamic Targeting Network |
基于动态目标网络的视频对象快速分割
|
|
Solving Vision Problems via Filtering |
通过滤波解决视觉问题
|
|
GAN-Based Projector for Faster Recovery With Convergence Guarantees in Linear Inverse Problems |
线性反问题中基于GAN的投影实现具有收敛保证的更快恢复
|
|
Scoot: A Perceptual Metric for Facial Sketches |
Scoot:一种面部素描的感知度量
|
|
Learning Filter Basis for Convolutional Neural Network Compression |
基于滤波基学习的卷积神经网络压缩
|
|
End-to-End Learning of Representations for Asynchronous Event-Based Data |
面向异步事件数据的端到端表示学习
|
|
ERL-Net: Entangled Representation Learning for Single Image De-Raining |
ERL-Net:基于纠缠表示学习的单图像去雨
|
|
Perceptual Deep Depth Super-Resolution |
感知驱动的深度图超分辨率
|
|
3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera |
三维场景图:用于统一语义、三维空间和相机的结构
|
|
Floorplan-Jigsaw: Jointly Estimating Scene Layout and Aligning Partial Scans |
平面拼图:联合估计场景布局和对齐部分扫描
|
|
Enforcing Geometric Constraints of Virtual Normal for Depth Prediction |
基于虚拟法向几何约束的深度预测
|
|
Deep Contextual Attention for Human-Object Interaction Detection |
基于深度上下文注意的人-对象交互检测
|
|
Learning Compositional Neural Information Fusion for Human Parsing |
用于人体解析的组合式神经信息融合学习 |
|
Attentional Neural Fields for Crowd Counting |
人群计数的注意神经场
|
|
Understanding Human Gaze Communication by Spatio-Temporal Graph Reasoning |
用时空图推理理解人的凝视交流
|
|
Controllable Attention for Structured Layered Video Decomposition |
基于可控注意的结构化分层视频分解
|
|
GANalyze: Toward Visual Definitions of Cognitive Image Properties |
GANalyze:面向认知图像属性的视觉定义
|
|
Saliency-Guided Attention Network for Image-Sentence Matching |
显著性引导注意力网络在图像-句子匹配中的应用 |
|
CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval |
CAMP:用于文本-图像检索的跨模态自适应消息传递
|
|
ACMM: Aligned Cross-Modal Memory for Few-Shot Image and Sentence Matching |
ACMM:用于小样本图像和句子匹配的对齐跨模态记忆
|
|
Creativity Inspired Zero-Shot Learning |
创造力启发的零样本学习
|
|
Generating Easy-to-Understand Referring Expressions for Target Identifications |
为目标识别生成易于理解的指代表达
|
|
Language-Agnostic Visual-Semantic Embeddings |
语言不可知的视觉语义嵌入
|
|
Adversarial Representation Learning for Text-to-Image Matching |
文本-图像匹配中的对抗表示学习
|
|
Multi-Modality Latent Interaction Network for Visual Question Answering |
视觉问答的多模态潜在交互网络
|
|
Key.Net: Keypoint Detection by Handcrafted and Learned CNN Filters |
Key.Net:基于手工特征和CNN过滤器学习的关键点检测
|
|
Learning Two-View Correspondences and Geometry Using Order-Aware Network |
基于顺序感知网络的两视图对应与几何学习 |
|
Learning Meshes for Dense Visual SLAM |
面向稠密视觉SLAM的网格学习
|
|
EM-Fusion: Dynamic Object-Level SLAM With Probabilistic Data Association |
EM-Fusion:基于概率数据关联的动态对象级SLAM
|
|
ClusterSLAM: A SLAM Backend for Simultaneous Rigid Body Clustering and Motion Estimation |
ClusterSLAM:同时进行刚体聚类和运动估计的SLAM后端
|
|
Efficient and Robust Registration on the 3D Special Euclidean Group |
三维特殊欧氏群的高效鲁棒配准
|
|
Algebraic Characterization of Essential Matrices and Their Averaging in Multiview Settings |
多视图环境下本质矩阵的代数特征及其平均
|
|
Liquid Warping GAN: A Unified Framework for Human Motion Imitation, Appearance Transfer and Novel View Synthesis |
液体翘曲GAN:一个统一的人体运动模拟、外观传递和新视角合成框架 |
|
RelGAN: Multi-Domain Image-to-Image Translation via Relative Attributes |
RelGAN:基于相对属性的多域图像-图像转换
|
|
Attribute-Driven Spontaneous Motion in Unpaired Image Translation |
非成对图像翻译中的属性驱动自发运动
|
|
Everybody Dance Now |
现在大家都跳舞
|
|
Multimodal Style Transfer via Graph Cuts |
基于图割的多模态风格迁移
|
|
A Closed-Form Solution to Universal Style Transfer |
通用风格转换的一种闭式解法
|
|
Progressive Reconstruction of Visual Structure for Image Inpainting |
图像修补中视觉结构的渐进重建
|
|
Variational Adversarial Active Learning |
变分对抗性主动学习
|
Active learning: the learning algorithm actively proposes which data should be labeled |
Confidence Regularized Self-Training |
基于自信心正则化的自训练
|
|
Anchor Loss: Modulating Loss Scale Based on Prediction Difficulty |
锚损失:基于预测难度的调整损失尺度
|
|
Local Aggregation for Unsupervised Learning of Visual Embeddings |
基于局部聚集的无监督视觉嵌入学习
|
|
PR Product: A Substitute for Inner Product in Neural Networks |
PR乘积:神经网络内积的一种代用品
|
|
CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features |
CutMix:训练具有可定位特征的强分类器的正则化策略
|
|
Towards Interpretable Object Detection by Unfolding Latent Structures |
基于潜在结构展开的可解释目标检测
|
|
Scaling Object Detection by Transferring Classification Weights |
基于分类权重迁移的目标检测规模扩展
|
|
Scale-Aware Trident Networks for Object Detection |
基于尺度感知的Trident网络实现目标检测
|
|
Object-Aware Instance Labeling for Weakly Supervised Object Detection |
基于目标感知实例标记的弱监督目标检测
|
|
Generative Modeling for Small-Data Object Detection |
小数据目标检测的生成模型
|
|
Transductive Learning for Zero-Shot Object Detection |
基于直推式学习的零样本目标检测
|
|
Self-Training and Adversarial Background Regularization for Unsupervised Domain Adaptive One-Stage Object Detection |
基于自训练与对抗背景正则化的无监督域自适应单阶段目标检测
|
|
Memory-Based Neighbourhood Embedding for Visual Recognition |
基于记忆的邻域嵌入实现视觉识别
|
|
Self-Similarity Grouping: A Simple Unsupervised Cross Domain Adaptation Approach for Person Re-Identification |
自相似分组:一种用于人再识别的简单无监督跨域自适应方法 |
|
Deep Reinforcement Active Learning for Human-in-the-Loop Person Re-Identification |
基于深度强化主动学习的人在回路人再识别
|
|
A Dual-Path Model With Adaptive Attention for Vehicle Re-Identification |
一种具有自适应注意的双路径模型实现车辆重识别
|
|
Bayesian Loss for Crowd Count Estimation With Point Supervision |
贝叶斯损失用于基于点监督的人群计数
|
|
Learning Spatial Awareness to Improve Crowd Counting |
空间感知学习提高人群计数
|
|
GradNet: Gradient-Guided Network for Visual Object Tracking |
GradNet:基于梯度引导网络的视觉目标跟踪
|
|
FAMNet: Joint Learning of Feature, Affinity and Multi-Dimensional Assignment for Online Multiple Object Tracking |
FAMNet:基于特征、亲和力和多维分配联合学习的在线多目标跟踪
|
|
Learning Discriminative Model Prediction for Tracking |
基于判别模型预测学习的跟踪
|
|
DynamoNet: Dynamic Action and Motion Network |
DynamoNet:动态动作与运动网络 |
|
SlowFast Networks for Video Recognition |
用于视频识别的SlowFast网络
|
|
Generative Multi-View Human Action Recognition |
生成性多视角人类行为识别
|
|
Multi-Agent Reinforcement Learning Based Frame Sampling for Effective Untrimmed Video Recognition |
基于多智能体强化学习的帧采样实现有效的未经修剪视频识别
|
|
SCSampler: Sampling Salient Clips From Video for Efficient Action Recognition |
SCSampler:从视频中抽取显著片段以实现高效的动作识别
|
|
Weakly Supervised Energy-Based Learning for Action Segmentation |
弱监督基于能量的学习实现动作分割
|
|
What Would You Expect? Anticipating Egocentric Actions With Rolling-Unrolling LSTMs and Modality Attention |
你期望什么?以滚动-展开的LSTMs和模态注意力预测自我中心行为
|
|
PIE: A Large-Scale Dataset and Models for Pedestrian Intention Estimation and Trajectory Prediction |
PIE:用于行人意图估计和轨迹预测的大规模数据集和模型
|
|
STGAT: Modeling Spatial-Temporal Interactions for Human Trajectory Prediction |
STGAT:用于人类轨迹预测的时-空交互建模
|
|
Learning Motion in Feature Space: Locally-Consistent Deformable Convolution Networks for Fine-Grained Action Detection |
特征空间中的运动学习:基于局部一致可变形卷积网络的细粒度动作检测
|
|
Dual Attention Matching for Audio-Visual Event Localization |
基于双注意匹配的视-听事件定位
|
|
Uncertainty-Aware Audiovisual Activity Recognition Using Deep Bayesian Variational Inference |
基于深度贝叶斯变分推理的不确定性感知的视-听活动识别
|
|
Non-Local Recurrent Neural Memory for Supervised Sequence Modeling |
基于非局部递归神经记忆的监督序列建模
|
|
Temporal Attentive Alignment for Large-Scale Video Domain Adaptation |
基于时间注意力对齐的大规模视频域自适应
|
|
Action Assessment by Joint Relation Graphs |
基于关节关系图的动作评估
|
|
Unsupervised Procedure Learning via Joint Dynamic Summarization |
基于联合动态摘要的无监督过程学习
|
|
ViSiL: Fine-Grained Spatio-Temporal Video Similarity Learning |
ViSiL:细粒度时-空视频相似度学习
|
|
Unsupervised Learning of Landmarks by Descriptor Vector Exchange |
基于描述向量交换的无监督关键点学习
|
|
Learning Compositional Representations for Few-Shot Recognition |
基于组合表示学习的小样本识别
|
|
Spectral Regularization for Combating Mode Collapse in GANs |
基于谱正则化的GANs抗模式崩溃
|
|
Scaling and Benchmarking Self-Supervised Visual Representation Learning |
自监督视觉表示学习的规模化与基准评测
|
|
Learning an Effective Equivariant 3D Descriptor Without Supervision |
无监督学习一种有效的等变三维描述子
|
|
KPConv: Flexible and Deformable Convolution for Point Clouds |
KPConv:用于点云的柔性可变形卷积
|
|
Neural Inter-Frame Compression for Video Coding |
基于神经帧间压缩的视频编码
|
|
Task2Vec: Task Embedding for Meta-Learning |
Task2Vec:基于任务嵌入的元学习
|
|
Deep Clustering by Gaussian Mixture Variational Autoencoders With Graph Embedding |
图嵌入实现基于高斯混合变分自编码的深度聚类
|
|
SoftTriple Loss: Deep Metric Learning Without Triplet Sampling |
软三元损失:无三元抽样的深度度量学习
|
|
A Weakly Supervised Fine Label Classifier Enhanced by Coarse Supervision |
一种基于粗监督的弱监督精细标记分类器
|
|
Gaussian Affinity for Max-Margin Class Imbalanced Learning |
基于高斯亲和度的最大间隔类别不平衡学习
|
|
AttPool: Towards Hierarchical Feature Representation in Graph Convolutional Networks via Attention Mechanism |
AttPool:基于注意力机制的图卷积网络层次特征表示
|
|
Deep Metric Learning With Tuplet Margin Loss |
基于多元组间隔损失的深度度量学习
|
|
Normalized Wasserstein for Mixture Distributions With Applications in Adversarial Learning and Domain Adaptation |
基于标准化Wasserstein的混合分布在对抗学习和域自适应中的应用 |
|
Fast and Practical Neural Architecture Search |
快速实用的神经网络架构搜索
|
|
Symmetric Graph Convolutional Autoencoder for Unsupervised Graph Representation Learning |
基于对称图卷积自动编码器的无监督图表示学习
|
|
Deep Elastic Networks With Model Selection for Multi-Task Learning |
基于模型选择的深度弹性网络实现多任务学习
|
|
Metric Learning With HORDE: High-Order Regularizer for Deep Embeddings |
基于HORDE的度量学习:深度嵌入的高阶正则化
|
|
Adversarial Learning With Margin-Based Triplet Embedding Regularization |
基于间隔的三元组嵌入正则化实现对抗学习
|
|
Simultaneous Multi-View Instance Detection With Learned Geometric Soft-Constraints |
基于学习几何软约束的多视图同时实例检测
|
|
CenterNet: Keypoint Triplets for Object Detection |
CenterNet:基于关键点三元组的对象检测
|
|
Online Hyper-Parameter Learning for Auto-Augmentation Strategy |
面向自动数据增强策略的在线超参数学习
|
|
DANet: Divergent Activation for Weakly Supervised Object Localization |
DANet:基于发散激活的弱监督目标定位
|
|
Selective Sparse Sampling for Fine-Grained Image Recognition |
基于选择性稀疏采样的细粒度图像识别
|
|
Dynamic Anchor Feature Selection for Single-Shot Object Detection |
基于动态锚特征选择的单次目标检测
|
|
Incremental Learning Using Conditional Adversarial Networks |
基于条件对抗网络的增量学习
|
|
Bilateral Adversarial Training: Towards Fast Training of More Robust Models Against Adversarial Attacks |
双边对抗性训练:快速训练更强大的抗对抗性攻击模型 |
|
View Confusion Feature Learning for Person Re-Identification |
基于视图混淆特征学习的人再识别
|
|
Auto-FPN: Automatic Network Architecture Adaptation for Object Detection Beyond Classification |
Auto-FPN:自动网络体系结构自适应实现超越分类的目标检测
|
|
PARN: Position-Aware Relation Networks for Few-Shot Learning |
PARN:基于位置感知关系网络的小样本学习
|
|
Multi-Adversarial Faster-RCNN for Unrestricted Object Detection |
基于多对抗Faster-RCNN的无限制目标检测
|
|
Object Guided External Memory Network for Video Object Detection |
基于目标引导外部记忆网络的视频目标检测
|
|
An Empirical Study of Spatial Attention Mechanisms in Deep Networks |
深度网络中空间注意机制的实证研究
|
|
Attribute Attention for Semantic Disambiguation in Zero-Shot Learning |
零样本学习中基于属性注意的语义消歧
|
|
CIIDefence: Defeating Adversarial Attacks by Fusing Class-Specific Image Inpainting and Image Denoising |
CIIDefence:通过融合特定类别的图像修复和图像去噪来战胜对抗性攻击
|
|
ThunderNet: Towards Real-Time Generic Object Detection on Mobile Devices |
ThunderNet:面向移动设备的实时通用目标检测
|
|
Dual Student: Breaking the Limits of the Teacher in Semi-Supervised Learning |
双重学生:打破教师在半监督学习中的局限
|
|
MVP Matching: A Maximum-Value Perfect Matching for Mining Hard Samples, With Application to Person Re-Identification |
MVP匹配:挖掘难样本的极大值完全匹配方法及其在人再识别中的应用 |
|
Adaptive Context Network for Scene Parsing |
用于场景分析的自适应上下文网络
|
|
Constructing Self-Motivated Pyramid Curriculums for Cross-Domain Semantic Segmentation: A Non-Adversarial Approach |
基于自我激励金字塔课程的跨域语义分割:一种非对抗性方法 |
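Curriculum learning: based on local distributions. Self-motivation: based on latent variables. The paper combines the two, together with a pyramid scheme, to achieve domain-adaptive semantic segmentation |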
SparseMask: Differentiable Connectivity Learning for Dense Image Prediction |
SparseMask:用于稠密图像预测的可微连通学习
|
|
Significance-Aware Information Bottleneck for Domain Adaptive Semantic Segmentation |
基于重要性感知信息Bottleneck的域自适应语义分割
|
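An improvement on GAN-based domain-adaptive semantic segmentation that imposes a significance-aware constraint on the latent variables (see Figs. 2 and 3). |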
Relational Attention Network for Crowd Counting |
基于关系注意力网络的人群计数
|
|
ACFNet: Attentional Class Feature Network for Semantic Segmentation |
ACFNet:基于注意力类特征网络的语义分割
|
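A method that refines semantic segmentation using class-level features (see Figs. 2 and 3). Starting from a coarse segmentation, it extracts a feature for each class, uses these class features to apply attention to the backbone features, and refines the segmentation on that basis. |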
Frame-to-Frame Aggregation of Active Regions in Web Videos for Weakly Supervised Semantic Segmentation |
基于web视频活动区域帧间聚合的弱监督语义分割
|
|
Boundary-Aware Feature Propagation for Scene Segmentation |
基于边界感知特征传播的场景分割
|
|
Self-Ensembling With GAN-Based Data Augmentation for Domain Adaptation in Semantic Segmentation |
基于GAN数据增强的自集成在域自适应语义分割中的应用 |
|
Explaining the Ambiguity of Object Detection and 6D Pose From Visual Data |
从视觉数据解释目标检测和6D姿态的模糊性
|
|
Accurate Monocular 3D Object Detection via Color-Embedded 3D Reconstruction for Autonomous Driving |
基于彩色嵌入三维重建的单目三维物体精确检测在自动驾驶的应用 |
|
MonoLoco: Monocular 3D Pedestrian Localization and Uncertainty Estimation |
MonoLoco:单目三维行人定位及不确定性估计
|
|
Unsupervised High-Resolution Depth Learning From Videos With Dual Networks |
基于双网络的视频无监督高分辨率深度学习
|
|
Bayesian Graph Convolution LSTM for Skeleton Based Action Recognition |
贝叶斯图卷积LSTM实现基于骨架的动作识别
|
|
DeCaFA: Deep Convolutional Cascade for Face Alignment in the Wild |
DeCaFA:基于深度卷积级联的自然场景人脸对齐
|
|
Probabilistic Face Embeddings |
概率人脸嵌入
|
|
Gaze360: Physically Unconstrained Gaze Estimation in the Wild |
Gaze360:野外自然无约束凝视估计
|
|
Unsupervised Person Re-Identification by Camera-Aware Similarity Consistency Learning |
基于摄像机感知相似一致性学习的无监督人再识别
|
|
Photo-Realistic Monocular Gaze Redirection Using Generative Adversarial Networks |
基于GAN的照片级真实感单目注视重定向
|
|
Dynamic Kernel Distillation for Efficient Pose Estimation in Videos |
动态核蒸馏在视频位姿估计中的应用 |
|
Single-Stage Multi-Person Pose Machines |
单级多人位姿机 |
|
SO-HandNet: Self-Organizing Network for 3D Hand Pose Estimation With Semi-Supervised Learning |
SO-HandNet:基于自组织网络的半监督三维手姿态估计
|
|
Adaptive Wing Loss for Robust Face Alignment via Heatmap Regression |
基于自适应Wing损失的热图回归鲁棒人脸对齐
|
|
Single-Network Whole-Body Pose Estimation |
单网络全身姿态估计
|
|
Face Alignment With Kernel Density Deep Neural Network |
基于核密度深度神经网络的人脸对齐
|
|
Spatiotemporal Feature Residual Propagation for Action Prediction |
基于时空特征残差传播的动作预测
|
|
Identity From Here, Pose From There: Self-Supervised Disentanglement and Generation of Objects Using Unlabeled Videos |
从这里来的身份,从那里来的姿势:使用无标签视频的自监督分离和对象生成
|
|
Relation Distillation Networks for Video Object Detection |
基于关系蒸馏网络的视频对象检测
|
|
Video Compression With Rate-Distortion Autoencoders |
基于率失真自编码器的视频压缩
|
|
Non-Local ConvLSTM for Video Compression Artifact Reduction |
基于非局部ConvLSTM的视频压缩伪影减少
|
|
Self-Supervised Moving Vehicle Tracking With Stereo Sound |
基于立体声的自监督运动车辆跟踪
|
|
Self-Supervised Learning With Geometric Constraints in Monocular Video: Connecting Flow, Depth, and Camera |
单目视频中带几何约束的自监督学习:连接流、深度和摄像机
|
|
Learning Temporal Action Proposals With Fewer Labels |
用较少的标签学习时序动作提议
|
|
TSM: Temporal Shift Module for Efficient Video Understanding |
TSM:用于高效视频理解的时间移位模块
|
|
Graph Convolutional Networks for Temporal Action Localization |
基于图卷积网络的时域动作定位
|
|
Fast Object Detection in Compressed Video |
压缩视频中的快速目标检测
|
|
Predicting 3D Human Dynamics From Video |
视频的三维人体动力学预测
|
|
Imitation Learning for Human Pose Prediction |
基于模仿学习的人体姿态预测
|
|
Human Motion Prediction via Spatio-Temporal Inpainting |
基于时空修复的人体运动预测
|
|
Structured Prediction Helps 3D Human Motion Modelling |
结构化预测有助于三维人体运动建模
|
|
Learning Shape Templates With Structured Implicit Functions |
基于结构化隐函数的形状模板学习
|
|
CompenNet++: End-to-End Full Projector Compensation |
CompenNet++:端到端的完整投影仪补偿
|
|
Deep Parametric Indoor Lighting Estimation |
深度参数化室内照明估算
|
|
FSGAN: Subject Agnostic Face Swapping and Reenactment |
FSGAN:主体不可知的人脸交换和重生成
|
|
Deep Single-Image Portrait Relighting |
基于深度学习的单幅图像人像重光照
|
|
PU-GAN: A Point Cloud Upsampling Adversarial Network |
PU-GAN:一种点云上采样对抗网络
|
|
Neural 3D Morphable Models: Spiral Convolutional Networks for 3D Shape Representation Learning and Generation |
神经三维变形模型:螺旋卷积网络在三维形状表示学习与生成中的应用 |
|
Joint Learning of Saliency Detection and Weakly Supervised Semantic Segmentation |
显著性检测与弱监督语义分割的联合学习
|
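Weakly supervised semantic segmentation: training uses two kinds of training sets (one with pixel-level saliency labels, one with class-level classification labels), and the trained model outputs pixel-level semantic segmentation. |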
Towards High-Resolution Salient Object Detection |
高分辨率显著目标检测
|
|
Event-Based Motion Segmentation by Motion Compensation |
利用运动补偿实现基于事件的运动分割
|
|
Depth-Induced Multi-Scale Recurrent Attention Network for Saliency Detection |
基于深度诱导多尺度递归注意力网络的显著性检测
|
|
Stacked Cross Refinement Network for Edge-Aware Salient Object Detection |
基于叠层交叉求精网络的边缘感知显著目标检测
|
|
Motion Guided Attention for Video Salient Object Detection |
基于运动引导注意力的视频显著目标检测
|
|
Semi-Supervised Video Salient Object Detection Using Pseudo-Labels |
基于伪标签的半监督视频显著目标检测
|
|
Joint Learning of Semantic Alignment and Object Landmark Detection |
语义对齐与目标标志检测的联合学习
|
|
RainFlow: Optical Flow Under Rain Streaks and Rain Veiling Effect |
雨流:雨带和雨幕效应下的光流
|
|
GridDehazeNet: Attention-Based Multi-Scale Network for Image Dehazing |
GridDehazeNet:基于注意力的多尺度图像去雾网络
|
|
Learning to See Moving Objects in the Dark |
学会在黑暗中看到移动的物体
|
|
SegSort: Segmentation by Discriminative Sorting of Segments |
SegSort:通过判别性分段排序来分割
|
|
What Synthesis Is Missing: Depth Adaptation Integrated With Weak Supervision for Indoor Scene Parsing |
合成缺少什么:深度自适应与弱监督相结合的室内场景分析
|
|
AdaptIS: Adaptive Instance Selection Network |
AdaptIS:自适应实例选择网络
|
|
DADA: Depth-Aware Domain Adaptation in Semantic Segmentation |
DADA:基于深度感知域自适应的语义分割
|
|
Guided Curriculum Model Adaptation and Uncertainty-Aware Evaluation for Semantic Nighttime Image Segmentation |
基于引导课程模型自适应和不确定性感知评价的夜间图像语义分割
|
Curriculum learning, model adaptation, and semantic segmentation of nighttime images. |
SceneGraphNet: Neural Message Passing for 3D Indoor Scene Augmentation |
SceneGraphNet:基于神经信息传递的三维室内场景增强
|
|
SkyScapes Fine-Grained Semantic Understanding of Aerial Scenes |
SkyScapes:航拍场景的细粒度语义理解
|
|
Transferable Representation Learning in Vision-and-Language Navigation |
基于可迁移表示学习的视觉与语言导航
|
|
Towards Unsupervised Image Captioning With Shared Multimodal Embeddings |
基于共享多模式嵌入的无监督图像标注
|
|
ViCo: Word Embeddings From Visual Co-Occurrences |
ViCo:基于视觉共现的词嵌入
|
|
Seq-SG2SL: Inferring Semantic Layout From Scene Graph Through Sequence to Sequence Learning |
Seq-SG2SL:通过序列到序列学习从场景图推断语义布局
|
|
U-CAM: Visual Explanation Using Uncertainty Based Class Activation Maps |
U-CAM:基于不确定性的类激活图实现可视化解释
|
|
See-Through-Text Grouping for Referring Image Segmentation |
基于透明文本分组的参考图像分割
|
|
VideoBERT: A Joint Model for Video and Language Representation Learning |
VideoBERT:一种视频和语言表示学习的联合模型
|
|
Language Features Matter: Effective Language Representations for Vision-Language Tasks |
语言特征的重要性:视觉-语言任务的有效语言表示
|
|
Semantic Stereo Matching With Pyramid Cost Volumes |
基于金字塔CostVolume的语义立体匹配
|
1. Uses semantic segmentation to improve stereo matching. 2. Uses cost volumes at multiple scales. |
Spatial Correspondence With Generative Adversarial Network: Learning Depth From Monocular Videos |
基于GAN的空间对应:从单目视频学习深度
|
|
Learning Relationships for Multi-View 3D Object Recognition |
基于关系学习的多视图三维目标识别
|
|
View N-Gram Network for 3D Object Retrieval |
基于视图N-Gram网络的三维对象检索
|
|
Expert Sample Consensus Applied to Camera Re-Localization |
专家样本一致性在相机再定位中的应用 |
|
Semantic Part Detection via Matching: Learning to Generalize to Novel Viewpoints From Limited Training Data |
基于匹配的语义部分检测:学习从有限的训练数据推广到新的视点
|
|
Dynamic Points Agglomeration for Hierarchical Point Sets Learning |
基于动态点聚集的层次点集学习
|
|
Attributing Fake Images to GANs: Learning and Analyzing GAN Fingerprints |
将假图像归因于GANs:GAN指纹的学习与分析
|
|
Dual Adversarial Inference for Text-to-Image Synthesis |
基于双对抗推理的文本到图像合成
|
|
View-LSTM: Novel-View Video Synthesis Through View Decomposition |
View-LSTM:基于视图分解的新视图视频合成 |
|
HoloGAN: Unsupervised Learning of 3D Representations From Natural Images |
HoloGAN:自然图像三维表示的无监督学习
|
|
Unpaired Image-to-Speech Synthesis With Multimodal Information Bottleneck |
基于多模态信息Bottleneck的非配对图像-语音合成
|
|
Improved Conditional VRNNs for Video Prediction |
用于视频预测的改进条件VRNN
|
|
Visualizing the Invisible: Occluded Vehicle Segmentation and Recovery |
可视化看不见:遮挡车辆分割与恢复
|
|
Learning Single Camera Depth Estimation Using Dual-Pixels |
利用双像素学习单摄像机深度估计
|
|
Domain-Adaptive Single-View 3D Reconstruction |
域自适应单视图三维重建 |
|
Transformable Bottleneck Networks |
可转换Bottleneck网络 |
|
RIO: 3D Object Instance Re-Localization in Changing Indoor Environments |
RIO:在变化的室内环境中3D对象实例的重新定位
|
|
Pix2Pose: Pixel-Wise Coordinate Regression of Objects for 6D Pose Estimation |
Pix2Pose:基于逐像素坐标回归的6D姿态估计
|
|
CDPN: Coordinates-Based Disentangled Pose Network for Real-Time RGB-Based 6-DoF Object Pose Estimation |
CDPN:基于坐标的解纠缠位姿网络实现实时基于RGB的六自由度目标位姿估计
|
|
C3DPO: Canonical 3D Pose Networks for Non-Rigid Structure From Motion |
C3DPO:基于标准三维位姿网络的非刚性Structure From Motion
|
|
Learning to Reconstruct 3D Manhattan Wireframes From a Single Image |
学习从单个图像重建曼哈顿三维线框
|
|
Soft Rasterizer: A Differentiable Renderer for Image-Based 3D Reasoning |
软光栅化器:一种可微渲染器实现基于图像的三维推理
|
|
Learnable Triangulation of Human Pose |
人体姿势三角剖分学习
|
|
xR-EgoPose: Egocentric 3D Human Pose From an HMD Camera |
xR-EgoPose:HMD相机的以自我为中心的3D人体姿势
|
|
DeepHuman: 3D Human Reconstruction From a Single Image |
DeepHuman:从单个图像重建三维人体
|
|
A Neural Network for Detailed Human Depth Estimation From a Single Image |
单幅图像的人体深度精细估计神经网络
|
|
DenseRaC: Joint 3D Pose and Shape Estimation by Dense Render-and-Compare |
DenseRaC:基于稠密渲染和比较的联合三维姿态和形状估计
|
|
Not All Parts Are Created Equal: 3D Pose Estimation by Modeling Bi-Directional Dependencies of Body Parts |
并非所有的部分都是平等地创建:通过建立身体部分的双向依赖关系来估计三维姿势
|
|
Extreme View Synthesis |
极限视图合成 |
|
View Independent Generative Adversarial Network for Novel View Synthesis |
视图无关GAN在新视图合成中的应用 |
|
Cascaded Context Pyramid for Full-Resolution 3D Semantic Scene Completion |
基于级联上下文金字塔的全分辨率三维语义场景补全
|
|
View-Consistent 4D Light Field Superpixel Segmentation |
视图一致的4D光场超像素分割
|
|
GLoSH: Global-Local Spherical Harmonics for Intrinsic Image Decomposition |
GLoSH:用于内在图像分解的全局-局部球谐函数
|
|
Surface Normals and Shape From Water |
利用水恢复表面法向量和形状
|
|
Restoration of Non-Rigidly Distorted Underwater Images Using a Combination of Compressive Sensing and Local Polynomial Image Representations |
基于压缩传感和局部多项式图像表示组合的非刚性畸变水下图像复原
|
|
Learning Perspective Undistortion of Portraits |
学习人像透视畸变校正
|
|
Towards Photorealistic Reconstruction of Highly Multiplexed Lensless Images |
高复用无透镜图像的真实感重建
|
|
Unconstrained Motion Deblurring for Dual-Lens Cameras |
双镜头相机的无约束运动去模糊
|
|
Stochastic Exposure Coding for Handling Multi-ToF-Camera Interference |
处理多TOF相机干扰的随机曝光编码
|
|
Convolutional Approximations to the General Non-Line-of-Sight Imaging Operator |
一般非视线成像算子的卷积逼近
|
|
Agile Depth Sensing Using Triangulation Light Curtains |
基于三角光幕的快速深度传感
|
|
Asynchronous Single-Photon 3D Imaging |
异步单光子三维成像
|
|
Cross-Dataset Person Re-Identification via Unsupervised Pose Disentanglement and Adaptation |
基于无监督姿势分离和自适应的跨数据集人再识别
|
|
A Learned Representation for Scalable Vector Graphics |
基于表示学习的可伸缩矢量图形
|
|
ELF: Embedded Localisation of Features in Pre-Trained CNN |
ELF:在预先训练的CNN中嵌入特征定位
|
|
Joint Group Feature Selection and Discriminative Filter Learning for Robust Visual Object Tracking |
基于联合组特征选择和判别滤波器学习的鲁棒视觉目标跟踪
|
|
Sampling Wisely: Deep Image Embedding by Top-K Precision Optimization |
明智采样:基于Top-K精度优化的深度图像嵌入
|
|
On the Global Optima of Kernelized Adversarial Representation Learning |
核化对抗表征学习的全局最优解
|
|
Addressing Model Vulnerability to Distributional Shifts Over Image Transformation Sets |
解决图像转换集上分布移位的模型脆弱性
|
|
Attract or Distract: Exploit the Margin of Open Set |
吸引或分散注意力:探索开放集的边缘
|
|
MIC: Mining Interclass Characteristics for Improved Metric Learning |
MIC:挖掘类间特征以改进度量学习
|
|
Self-Supervised Representation Learning via Neighborhood-Relational Encoding |
基于邻域关系编码的自监督表示学习
|
|
AWSD: Adaptive Weighted Spatiotemporal Distillation for Video Representation |
AWSD:用于视频表示的自适应加权时空蒸馏 |
|
Bilinear Attention Networks for Person Retrieval |
用于人检索的双线性注意网络
|
|
Discriminative Feature Learning With Consistent Attention Regularization for Person Re-Identification |
基于一致注意正则化的判别特征学习用于人再识别
|
|
Semi-Supervised Domain Adaptation via Minimax Entropy |
基于极大极小熵的半监督域自适应
|
|
Boosting Few-Shot Visual Learning With Self-Supervision |
利用自监督促进少样本视觉学习
|
|
FDA: Feature Disruptive Attack |
FDA:特征破坏性攻击
|
|
A Novel Unsupervised Camera-Aware Domain Adaptation Framework for Person Re-Identification |
一种新的无监督摄像机感知域自适应框架实现人再识别
|
|
Recover and Identify: A Generative Dual Model for Cross-Resolution Person Re-Identification |
恢复与识别:一种生成性双重模型实现交叉分辨人再识别
|
|
Cross-View Policy Learning for Street Navigation |
用于街道导航的交叉视野策略学习
|
|
Learning Across Tasks and Domains |
跨任务、跨领域学习 |
|
EMPNet: Neural Localisation and Mapping Using Embedded Memory Points |
EMPNet:基于嵌入式记忆点的神经定位与建图
|
|
AVT: Unsupervised Learning of Transformation Equivariant Representations by Autoencoding Variational Transformations |
AVT:自编码变分变换实现变换等变表示的无监督学习
|
|
Composite Shape Modeling via Latent Space Factorization |
基于潜在空间分解的复合形状建模
|
|
Deep Comprehensive Correlation Mining for Image Clustering |
基于深度综合相关挖掘的图像聚类
|
|
Unsupervised Multi-Task Feature Learning on Point Clouds |
点云上的无监督多任务特征学习
|
|
Reciprocal Multi-Layer Subspace Learning for Multi-View Clustering |
基于互反多层子空间学习的多视图聚类
|
|
Geometric Disentanglement for Generative Latent Shape Models |
基于几何解缠的生成性潜在形状模型
|
|
GAN-Tree: An Incrementally Learned Hierarchical Generative Framework for Multi-Modal Data Distributions |
GAN-Tree:一种多模态数据分布的增量学习分层生成框架
|
|
GODS: Generalized One-Class Discriminative Subspaces for Anomaly Detection |
GODS:广义一类判别子空间用于异常检测
|
|
Neighborhood Preserving Hashing for Scalable Video Retrieval |
可分级视频检索中的邻域保持哈希算法
|
|
Self-Training With Progressive Augmentation for Unsupervised Cross-Domain Person Re-Identification |
无监督跨域人再识别的渐进增强自训练
|
|
SCRDet: Towards More Robust Detection for Small, Cluttered and Rotated Objects |
SCRDet:对小的、杂乱的和旋转的物体进行更稳健的检测
|
|
Cross-X Learning for Fine-Grained Visual Categorization |
基于Cross-X学习的细粒度视觉分类
|
|
Maximum-Margin Hamming Hashing |
最大边缘汉明散列
|
|
Conservative Wasserstein Training for Pose Estimation |
基于保守Wasserstein训练的姿势估计
|
|
Learning to Rank Proposals for Object Detection |
基于排序建议学习的目标检测
|
|
Vehicle Re-Identification With Viewpoint-Aware Metric Learning |
基于视点感知度量学习的车辆再识别
|
|
WSOD2: Learning Bottom-Up and Top-Down Objectness Distillation for Weakly-Supervised Object Detection |
WSOD2:基于自下而上和自上而下目标性蒸馏学习的弱监督目标检测
|
|
Localization of Deep Inpainting Using High-Pass Fully Convolutional Network |
基于高通全卷积网络的深度修补定位
|
|
Clustered Object Detection in Aerial Images |
航空图像中的簇状目标检测
|
|
Unsupervised Graph Association for Person Re-Identification |
基于无监督图关联的人再识别
|
|
Learning a Mixture of Granularity-Specific Experts for Fine-Grained Categorization |
基于粒度特定专家混合学习的细粒度分类
|
|
advPattern: Physical-World Attacks on Deep Person Re-Identification via Adversarially Transformable Patterns |
advPattern:通过对抗转换模式实现对人再识别进行物理世界攻击
|
|
ABD-Net: Attentive but Diverse Person Re-Identification |
ABD-Net:专注但多元的人再识别
|
|
From Open Set to Closed Set: Counting Objects by Spatial Divide-and-Conquer |
从开集到闭集:基于空间分治的对象计数
|
|
Towards Precise End-to-End Weakly Supervised Object Detection Network |
精确的端到端弱监督目标检测网络
|
|
Learn to Scale: Generating Multipolar Normalized Density Maps for Crowd Counting |
学习缩放:生成用于人群计数的多极归一化密度图
|
|
Ground-to-Aerial Image Geo-Localization With a Hard Exemplar Reweighting Triplet Loss |
具有难样本重加权三元损失的地-空图像地理定位
|
|
Learning to Discover Novel Visual Categories via Deep Transfer Clustering |
通过深度转移聚类学习发现新的视觉类别
|
|
AM-LFS: AutoML for Loss Function Search |
AM-LFS:用于损失函数搜索的AutoML
|
|
Few-Shot Object Detection via Feature Reweighting |
基于特征重加权的少样本目标检测
|
|
Objects365: A Large-Scale, High-Quality Dataset for Object Detection |
Objects365:用于目标检测的大规模高质量数据集
|
|
Efficient and Accurate Arbitrary-Shaped Text Detection With Pixel Aggregation Network |
基于像素聚集网络的任意形状文本检测
|
|
Foreground-Aware Pyramid Reconstruction for Alignment-Free Occluded Person Re-Identification |
基于前景感知金字塔重建的无对齐遮挡人再识别
|
|
Collect and Select: Semantic Alignment Metric Learning for Few-Shot Learning |
收集和选择:用于少样本学习的语义对齐度量学习
|
|
Bayesian Adaptive Superpixel Segmentation |
贝叶斯自适应超像素分割
|
|
CapsuleVOS: Semi-Supervised Video Object Segmentation Using Capsule Routing |
CapsuleVOS:基于胶囊路由的半监督视频对象分割
|
|
BAE-NET: Branched Autoencoder for Shape Co-Segmentation |
BAE-NET:基于分支自动编码器的形状共分割
|
|
VV-Net: Voxel VAE Net With Group Convolutions for Point Cloud Segmentation |
VV-Net:基于组卷积的体素VAE网络用于点云分割
|
|
Miss Detection vs. False Alarm: Adversarial Learning for Small Object Segmentation in Infrared Images |
漏检与虚警:红外图像小目标分割的对抗学习
|
|
Group-Wise Deep Object Co-Segmentation With Co-Attention Recurrent Neural Network |
基于共注意力递归神经网络的组深度目标共分割
|
|
Human Attention in Image Captioning: Dataset and Analysis |
图像标注中的人注意:数据集与分析
|
|
Variational Uncalibrated Photometric Stereo Under General Lighting |
一般光照下变分非定标光度立体
|
|
SPLINE-Net: Sparse Photometric Stereo Through Lighting Interpolation and Normal Estimation Networks |
SPLINE-Net:基于光照插值和法向估计网络的稀疏光度立体
|
|
Hyperspectral Image Reconstruction Using Deep External and Internal Learning |
基于内、外深度学习的高光谱图像重建
|
|
Gravity as a Reference for Estimating a Person's Height From Video |
以重力为参考从视频中估计人的身高 |
|
Shadow Removal via Shadow Image Decomposition |
基于阴影图像分解的阴影去除
|
|
OperatorNet: Recovering 3D Shapes From Difference Operators |
OperatorNet:从差分运算符恢复三维形状
|
|
Neural Inverse Rendering of an Indoor Scene From a Single Image |
单幅图像的室内场景神经逆绘制
|
|
ForkNet: Multi-Branch Volumetric Semantic Completion From a Single Depth Image |
ForkNet:单深度图像的多分支体积语义补全
|
|
Moving Indoor: Unsupervised Video Depth Learning in Challenging Environments |
室内移动:挑战环境下的无监督视频深度学习
|
|
GraphX-Convolution for Point Cloud Deformation in 2D-to-3D Conversion |
基于GraphX卷积的二维到三维转换中点云变形
|
|
FrameNet: Learning Local Canonical Frames of 3D Surfaces From a Single RGB Image |
FrameNet:从单个RGB图像中学习三维曲面的局部规范框架
|
|
Holistic++ Scene Understanding: Single-View 3D Holistic Scene Parsing and Human Pose Estimation With Human-Object Interaction and Physical Commonsense |
Holistic++场景理解:单视图三维整体场景解析和基于人-物交互和物理常识的人体姿态估计
|
|
MMAct: A Large-Scale Dataset for Cross Modal Human Action Understanding |
MMAct:用于跨模态人类行为理解的大规模数据集
|
|
HACS: Human Action Clips and Segments Dataset for Recognition and Temporal Localization |
HACS:用于识别和时间定位的人类动作片段和分割数据集
|
|
3C-Net: Category Count and Center Loss for Weakly-Supervised Action Localization |
3C-Net:基于类别计数和中心损失的弱监督动作定位
|
|
Grounded Human-Object Interaction Hotspots From Video |
从视频中学习有依据的人-物交互热点
|
|
Hallucinating IDT Descriptors and I3D Optical Flow Features for Action Recognition With CNNs |
利用幻觉IDT描述子和I3D光流特征实现基于CNNs的动作识别
|
|
Learning to Paint With Model-Based Deep Reinforcement Learning |
基于模型的深度强化学习在绘画中的应用 |
|
Neural Re-Simulation for Generating Bounces in Single Images |
基于神经网络再模拟的单幅图像反弹生成
|
|
Deep Appearance Maps |
深度外观图 |
|
GarNet: A Two-Stream Network for Fast and Accurate 3D Cloth Draping |
GarNet:一种快速准确的三维布料覆盖的双流网络
|
|
Joint Embedding of 3D Scan and CAD Objects |
三维扫描与CAD对象的联合嵌入
|
|
CompoNet: Learning to Generate the Unseen by Part Synthesis and Composition |
CompoNet:通过部分合成和组合学习生成看不见的部分
|
|
DDSL: Deep Differentiable Simplex Layer for Learning Geometric Signals |
DDSL:基于深度可微单纯形层的几何信号学习
|
|
EGNet: Edge Guidance Network for Salient Object Detection |
EGNet:用于显著目标检测的边缘引导网络
|
|
SID4VAM: A Benchmark Dataset With Synthetic Images for Visual Attention Modeling |
SID4VAM:用于视觉注意建模的合成图像基准数据集
|
|
Two-Stream Action Recognition-Oriented Video Super-Resolution |
面向双流动作识别的视频超分辨率
|
|
Where Is My Mirror? |
我的镜子在哪里? |
|
Disentangled Image Matting |
分离图像抠图 |
|
Guided Super-Resolution As Pixel-to-Pixel Transformation |
通过像素到像素转换引导超分辨率
|
|
Deep Learning for Light Field Saliency Detection |
光场显著性检测的深度学习
|
|
Optimizing the F-Measure for Threshold-Free Salient Object Detection |
基于F-测度优化的无阈值显著目标检测
|
|
Image Inpainting With Learnable Bidirectional Attention Maps |
基于可学习双向注意图的图像修复
|
|
Joint Demosaicking and Denoising by Fine-Tuning of Bursts of Raw Images |
通过对原始图像序列的微调实现联合去马赛克和去噪
|
|
DeblurGAN-v2: Deblurring (Orders-of-Magnitude) Faster and Better |
DeblurGAN-v2:(数量级地)更快更好的去模糊 |
|
Reflective Decoding Network for Image Captioning |
基于反射解码网络的图像标注
|
|
Joint Optimization for Cooperative Image Captioning |
协同图像标注的联合优化
|
|
Watch, Listen and Tell: Multi-Modal Weakly Supervised Dense Event Captioning |
看、听、说:多模弱监督密集事件标注
|
|
Joint Syntax Representation Learning and Visual Cue Translation for Video Captioning |
基于联合句法表示学习与视觉线索翻译的视频标注
|
|
Entangled Transformer for Image Captioning |
基于纠缠Transformer的图像标注
|
|
Shapeglot: Learning Language for Shape Differentiation |
Shapeglot:基于语言学习的形态分化
|
|
nocaps: novel object captioning at scale |
nocaps:尺度上的新对象标注
|
|
Fully Convolutional Geometric Features |
完全卷积几何特征 |
|
Learning Local RGB-to-CAD Correspondences for Object Pose Estimation |
基于局部RGB-CAD的对应关系学习的目标姿态估计
|
|
Depth From Videos in the Wild: Unsupervised Monocular Depth Learning From Unknown Cameras |
野外视频的深度:未知摄像机的无监督单目深度学习
|
|
OmniMVS: End-to-End Learning for Omnidirectional Stereo Matching |
OmniMVS:全方位立体匹配的端到端学习
|
Multi-view stereo matching. |
On the Over-Smoothing Problem of CNN Based Disparity Estimation |
基于CNN的视差估计的过平滑问题
|
|
Disentangling Propagation and Generation for Video Prediction |
视频预测的分离传播与生成
|
|
Guided Image-to-Image Translation With Bi-Directional Feature Transformation |
基于双向特征变换的图像-图像的转换
|
|
Towards Multi-Pose Guided Virtual Try-On Network |
面向多姿态引导的虚拟Try-On网络
|
|
Photorealistic Style Transfer via Wavelet Transforms |
基于小波变换的真实感风格转换
|
|
Personalized Fashion Design |
个性化服装设计 |
|
Tag2Pix: Line Art Colorization Using Text Tag With SECat and Changing Loss |
Tag2Pix:使用带有SECat和Changing损失的文本标记进行线条艺术着色
|
|
Free-Form Video Inpainting With 3D Gated Convolution and Temporal PatchGAN |
基于三维门控卷积和时域PatchGAN的自由形式视频修补
|
|
TextDragon: An End-to-End Framework for Arbitrary Shaped Text Spotting |
TextDragon:用于任意形状文本定位的端到端框架
|
|
Chinese Street View Text: Large-Scale Chinese Text Reading With Partially Supervised Learning |
中文街景文本:基于部分监督学习的大规模中文文本阅读
|
|
Deep Floor Plan Recognition Using a Multi-Task Network With Room-Boundary-Guided Attention |
房间边界引导注意的多任务网络实现深度楼层平面识别
|
|
GA-DAN: Geometry-Aware Domain Adaptation Network for Scene Text Detection and Recognition |
GA-DAN:用于场景文本检测和识别的几何感知域自适应网络
|
|
Large-Scale Tag-Based Font Retrieval With Generative Feature Learning |
基于生成特征学习的大规模标签字体检索
|
|
Convolutional Character Networks |
卷积字符网络 |
|
Geometry Normalization Networks for Accurate Scene Text Detection |
用于精确场景文本检测的几何规范化网络
|
|
Symmetry-Constrained Rectification Network for Scene Text Recognition |
对称约束校正网络在场景文本识别中的应用 |
|
YOLACT: Real-Time Instance Segmentation |
YOLACT:实时实例分割
|
See Fig. 2: object bounding boxes are extracted first, then pixel-level instance segmentation is performed. |
Expectation-Maximization Attention Networks for Semantic Segmentation |
基于期望最大化注意力网络的语义分割
|
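As shown in Fig. 2, the idea and iterative procedure of the EM algorithm are embedded in the deep network to replace the self-attention step (it does not need to access all of the data, is more flexible than Non-Local, and can run faster). |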
Multi-Class Part Parsing With Joint Boundary-Semantic Awareness |
基于联合边界语义感知的多类部分解析
|
|
Explaining Neural Networks Semantically and Quantitatively |
神经网络的语义化与定量化解释
|
|
PANet: Few-Shot Image Semantic Segmentation With Prototype Alignment |
PANet:基于原型对齐的少样本图像语义分割
|
|
ShapeMask: Learning to Segment Novel Objects by Refining Shape Priors |
ShapeMask:通过精化形状先验学习分割新对象
|
|
Sequence Level Semantics Aggregation for Video Object Detection |
基于序列级语义聚合的视频对象检测
|
|
Video Object Segmentation Using Space-Time Memory Networks |
基于时空存储网络的视频对象分割
|
|
Zero-Shot Video Object Segmentation via Attentive Graph Neural Networks |
基于注意力图神经网络的零样本视频对象分割
|
|
MeteorNet: Deep Learning on Dynamic 3D Point Cloud Sequences |
MeteorNet:动态三维点云序列的深度学习
|
|
3D Instance Segmentation via Multi-Task Metric Learning |
基于多任务度量学习的三维实例分割
|
|
DeepGCNs: Can GCNs Go As Deep As CNNs? |
DeepGCNs:GCN能像CNN一样深吗? |
|
Deep Hough Voting for 3D Object Detection in Point Clouds |
点云中基于深度Hough投票的三维目标检测
|
|
M3D-RPN: Monocular 3D Region Proposal Network for Object Detection |
M3D-RPN:用于目标检测的单目3D区域建议网络
|
|
SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences |
SemanticKITTI:用于激光雷达序列语义场景理解的数据集
|
|
WoodScape: A Multi-Task, Multi-Camera Fisheye Dataset for Autonomous Driving |
WoodScape:一个用于自动驾驶的多任务、多摄像机鱼眼数据集
|
|
Scalable Place Recognition Under Appearance Change for Autonomous Driving |
面向自主驾驶的外观变化下的可扩展位置识别
|
|
Exploring the Limitations of Behavior Cloning for Autonomous Driving |
探索在自主驾驶中行为克隆的局限性
|
|
Habitat: A Platform for Embodied AI Research |
Habitat:具身人工智能研究平台
|
|
Towards Interpretable Face Recognition |
面向可解释的人脸识别
|
|
Co-Mining: Deep Face Recognition With Noisy Labels |
联合挖掘:带噪声标签的深度人脸识别
|
|
Few-Shot Adaptive Gaze Estimation |
少样本自适应注视估计 |
|
Live Face De-Identification in Video |
视频中的实时人脸反识别
|
|
Face Video Deblurring Using 3D Facial Priors |
基于三维人脸先验的视频人脸去模糊
|
|
Semi-Supervised Monocular 3D Face Reconstruction With End-to-End Shape-Preserved Domain Transfer |
基于端到端形状保持域转移的半监督单目三维人脸重建
|
|
3D Face Modeling From Diverse Raw Scan Data |
基于多样的原始扫描数据的三维人脸建模
|
|
A Decoupled 3D Facial Shape Model by Adversarial Training |
一种基于对抗训练的去耦三维人脸形状模型
|
|
Photo-Realistic Facial Details Synthesis From Single Image |
基于单幅图像的真实感人脸细节合成
|
|
S2GAN: Share Aging Factors Across Ages and Share Aging Trends Among Individuals |
S2GAN:在各个年龄段共享老化因素,在个人间共享老化趋势
|
|
PuppetGAN: Cross-Domain Image Manipulation by Demonstration |
PuppetGAN:基于演示的跨域图像操作
|
|
Few-Shot Adversarial Learning of Realistic Neural Talking Head Models |
逼真神经说话人头部模型的少样本对抗学习
|
|
Pose-Aware Multi-Level Feature Network for Human Object Interaction Detection |
基于位姿感知多层次特征网络的人-物交互检测
|
|
TRB: A Novel Triplet Representation for Understanding 2D Human Body |
TRB:一种新的三元表示实现二维人体的理解
|
|
Learning Trajectory Dependencies for Human Motion Prediction |
用于人体运动预测的轨迹依赖学习
|
|
Cross-Domain Adaptation for Animal Pose Estimation |
基于跨域自适应的动物姿态估计
|
|
NOTE-RCNN: NOise Tolerant Ensemble RCNN for Semi-Supervised Object Detection |
NOTE-RCNN:基于噪声容忍集成RCNN的半监督目标检测
|
|
Unsupervised Out-of-Distribution Detection by Maximum Classifier Discrepancy |
基于最大分类器差异的无监督分布外检测
|
|
SBSGAN: Suppression of Inter-Domain Background Shift for Person Re-Identification |
SBSGAN:基于域间背景漂移抑制的人再识别
|
|
Enriched Feature Guided Refinement Network for Object Detection |
基于丰富特征引导细化网络的目标检测
|
|
Deep Meta Metric Learning |
深度元度量学习 |
|
Discriminative Feature Transformation for Occluded Pedestrian Detection |
基于判别特征变换的遮挡行人检测
|
|
Contextual Attention for Hand Detection in the Wild |
上下文注意在野外手部检测中的应用 |
|
Meta R-CNN: Towards General Solver for Instance-Level Low-Shot Learning |
Meta R-CNN:面向实例级少样本学习的通用求解器
|
|
Pyramid Graph Networks With Connection Attentions for Region-Based One-Shot Semantic Segmentation |
基于连接注意力的金字塔图网络实现基于区域的单样本语义分割
|
|
Presence-Only Geographical Priors for Fine-Grained Image Classification |
用于细粒度图像分类的仅存在地理先验
|
|
POD: Practical Object Detection With Scale-Sensitive Network |
POD:基于尺度敏感网络的实用目标检测
|
|
Human Uncertainty Makes Classification More Robust |
人类的不确定性使得分类更加可靠
|
|
FCOS: Fully Convolutional One-Stage Object Detection |
FCOS:全卷积单阶段目标检测
|
|
Self-Critical Attention Learning for Person Re-Identification |
自我批判性注意力学习用于人再识别
|
|
Temporal Knowledge Propagation for Image-to-Video Person Re-Identification |
基于时间知识传播的图像-视频人再识别
|
|
RepPoints: Point Set Representation for Object Detection |
RepPoints:用于目标检测的点集表示
|
|
SegEQA: Video Segmentation Based Visual Attention for Embodied Question Answering |
SegEQA:基于视频分割的视觉注意力在具身问答中的应用 |
|
No-Frills Human-Object Interaction Detection: Factorization, Layout Encodings, and Training Techniques |
无装饰的人-物交互检测:因子分解、布局编码和训练技术
|
|
Cap2Det: Learning to Amplify Weak Caption Supervision for Object Detection |
Cap2Det:学习增强弱图像描述监督以实现目标检测
|
|
No Fear of the Dark: Image Retrieval Under Varying Illumination Conditions |
不怕黑暗:不同光照条件下的图像检索
|
|
Hierarchical Shot Detector |
分层单阶段检测器
|
|
Few-Shot Learning With Global Class Representations |
基于全局类表示的少样本学习
|
|
Better to Follow, Follow to Be Better: Towards Precise Supervision of Feature Super-Resolution for Small Object Detection |
更好跟随,跟随更好:小目标检测中特征超分辨率的精确监督
|
|
Weakly Supervised Object Detection With Segmentation Collaboration |
基于分割协作的弱监督目标检测
|
|
AutoFocus: Efficient Multi-Scale Inference |
自动聚焦:有效的多尺度推理
|
|
Leveraging Long-Range Temporal Relationships Between Proposals for Video Object Detection |
利用候选框之间长程时间关系的视频目标检测
|
|
Transferable Contrastive Network for Generalized Zero-Shot Learning |
基于可转移对比网络的广义零样本学习
|
|
Fast Point R-CNN |
Fast Point R-CNN |
|
Mesh R-CNN |
Mesh R-CNN |
|
Deep Supervised Hashing With Anchor Graph |
基于锚图的深度监督哈希算法
|
|
Detecting 11K Classes: Large Scale Object Detection Without Fine-Grained Bounding Boxes |
11k类别检测:无细粒度包围盒的大规模目标检测
|
|
Re-ID Driven Localization Refinement for Person Search |
再识别驱动的定位精化实现人搜索
|
|
Hierarchical Encoding of Sequential Data With Compact and Sub-Linear Storage Cost |
基于压缩和次线性存储代价的序列数据分层编码
|
|
C-MIDN: Coupled Multiple Instance Detection Network With Segmentation Guidance for Weakly Supervised Object Detection |
C-MIDN:带分割指导的耦合多实例检测网络实现弱监督目标检测
|
|
Learning Feature-to-Feature Translator by Alternating Back-Propagation for Generative Zero-Shot Learning |
基于交替反向传播的特征到特征转换器学习实现生成式零样本学习
|
|
Deep Constrained Dominant Sets for Person Re-Identification |
用于人再识别的深度约束支配集
|
|
Invariant Information Clustering for Unsupervised Image Classification and Segmentation |
基于不变信息聚类的无监督图像分类与分割
|
|
Subspace Structure-Aware Spectral Clustering for Robust Subspace Clustering |
子空间结构感知谱聚类在鲁棒子空间聚类中的应用 |
|
Order-Preserving Wasserstein Discriminant Analysis |
保序Wasserstein判别分析
|
|
LayoutVAE: Stochastic Scene Layout Generation From a Label Set |
LayoutVAE:从标签集生成随机场景布局
|
|
Robust Variational Bayesian Point Set Registration |
鲁棒变分贝叶斯点集配准
|
|
Is an Affine Constraint Needed for Affine Subspace Clustering? |
仿射子空间聚类需要仿射约束吗? |
|
Meta-Learning to Detect Rare Objects |
检测稀有物体的元学习
|
|
New Convex Relaxations for MRF Inference With Unknown Graphs |
新凸松弛实现未知图MRF推理
|
|
Cluster Alignment With a Teacher for Unsupervised Domain Adaptation |
基于教师的聚类对齐实现无监督域自适应
|
|
Analyzing the Variety Loss in the Context of Probabilistic Trajectory Prediction |
概率轨迹预测背景下的多样性损失分析
|
|
Deep Mesh Reconstruction From Single RGB Images via Topology Modification Networks |
基于拓扑修正网络的单一RGB图像深度网格重建
|
|
UprightNet: Geometry-Aware Camera Orientation Estimation From Single Images |
UprightNet:基于单帧图像的几何感知摄像机方位估计
|
|
Escaping Plato's Cave: 3D Shape From Adversarial Rendering |
逃离柏拉图的洞穴:基于对抗性渲染的三维形态
|
|
Deep End-to-End Alignment and Refinement for Time-of-Flight RGB-D Module |
基于深度端到端对齐与细化的Time-of-Flight RGB-D模块
|
|
GEOBIT: A Geodesic-Based Binary Descriptor Invariant to Non-Rigid Deformations for RGB-D Images |
GEOBIT:一种对RGB-D图像非刚性变形保持不变的基于测地线的二值描述子
|
|
CDTB: A Color and Depth Visual Object Tracking Dataset and Benchmark |
CDTB:彩色和深度视觉目标跟踪数据集与基准
|
|
Learning Joint 2D-3D Representations for Depth Completion |
基于二维-三维联合表示学习的深度补全
|
|
Make a Face: Towards Arbitrary High Fidelity Face Manipulation |
做一张脸:朝向任意高保真的脸操作
|
|
M2FPA: A Multi-Yaw Multi-Pitch High-Quality Dataset and Benchmark for Facial Pose Analysis |
M2FPA:一个用于面部姿势分析的多偏航多俯仰高质量数据集和基准
|
|
Fair Loss: Margin-Aware Reinforcement Learning for Deep Face Recognition |
Fair Loss:面向深度人脸识别的间隔感知强化学习
|
|
Face De-Occlusion Using 3D Morphable Model and Generative Adversarial Network |
基于三维变形模型和GAN的人脸去遮挡
|
|
Detecting Photoshopped Faces by Scripting Photoshop |
用photoshop脚本检测photoshop人脸
|
|
Ego-Pose Estimation and Forecasting As Real-Time PD Control |
将自身位姿估计与预测作为实时PD控制问题
|
|
End-to-End Learning for Graph Decomposition |
图分解的端到端学习
|
|
Laplace Landmark Localization |
拉普拉斯地标定位
|
|
Through-Wall Human Mesh Recovery Using Radio Signals |
利用无线电信号进行穿墙人体网格恢复
|
|
Discriminatively Learned Convex Models for Set Based Face Recognition |
凸模型判别学习实现基于集的人脸识别
|
|
Camera Distance-Aware Top-Down Approach for 3D Multi-Person Pose Estimation From a Single RGB Image |
单一RGB图像中摄像机距离感知自顶向下方法实现三维多人姿态估计
|
|
Context-Aware Emotion Recognition Networks |
基于上下文感知网络的情感识别
|
|
Aggregation via Separation: Boosting Facial Landmark Detector With Semi-Supervised Style Translation |
基于分离的聚合:基于半监督风格平移的人脸标志检测增强
|
|
Deep Head Pose Estimation Using Synthetic Images and Partial Adversarial Domain Adaption for Continuous Label Spaces |
基于合成图像和连续标签空间部分对抗域自适应的深度头部姿态估计
|
|
Flare in Interference-Based Hyperspectral Cameras |
基于干涉的高光谱相机中的耀斑
|
|
Computational Hyperspectral Imaging Based on Dimension-Discriminative Low-Rank Tensor Recovery |
基于维度判别低秩张量恢复的计算高光谱成像
|
|
Deep Optics for Monocular Depth Estimation and 3D Object Detection |
基于深度光学的单目深度估计和三维目标检测
|
|
Physics-Based Rendering for Improving Robustness to Rain |
利用基于物理的渲染提升对雨天的鲁棒性
|
|
ARGAN: Attentive Recurrent Generative Adversarial Network for Shadow Detection and Removal |
ARGAN:用于阴影检测和消除的注意力循环生成对抗网络
|
|
Deep Tensor ADMM-Net for Snapshot Compressive Imaging |
用于快照压缩成像的深度张量ADMM-Net
|
|
Convex Relaxations for Consensus and Non-Minimal Problems in 3D Vision |
三维视觉中一致性问题与非最小问题的凸松弛
|
|
Pareto Meets Huber: Efficiently Avoiding Poor Minima in Robust Estimation |
Pareto遇上Huber:在鲁棒估计中高效避免较差的极小值
|
|
K-Best Transformation Synchronization |
K-最佳变换同步 |
|
Parametric Majorization for Data-Driven Energy Minimization Methods |
数据驱动能量最小化方法的参数化上界函数(Majorization)
|
|
A Bayesian Optimization Framework for Neural Network Compression |
基于贝叶斯优化框架的神经网络压缩
|
|
HiPPI: Higher-Order Projected Power Iterations for Scalable Multi-Matching |
HiPPI:面向可扩展多重匹配的高阶投影幂迭代
|
|
Language-Conditioned Graph Networks for Relational Reasoning |
基于语言条件图网络的关系推理
|
|
Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction |
讲述、绘制、重复:基于持续语言指令的图像生成与修改
|
|
Relation-Aware Graph Attention Network for Visual Question Answering |
基于关系感知图注意网络的视觉问答
|
|
Unpaired Image Captioning via Scene Graph Alignments |
基于场景图对齐的非配对图像描述生成
|
|
Modeling Inter and Intra-Class Relations in the Triplet Loss for Zero-Shot Learning |
在三元组损失中建模类间与类内关系以实现零样本学习
|
|
Occlusion-Shared and Feature-Separated Network for Occlusion Relationship Reasoning |
基于遮挡共享和特征分离网络的遮挡关系推理
|
|
Compositional Video Prediction |
组合式视频预测 |
|
Mixture-Kernel Graph Attention Network for Situation Recognition |
基于混合核图注意力网络的情境识别
|
|
Learning Similarity Conditions Without Explicit Supervision |
在无显式监督的情况下学习相似性条件
|
|
Joint Prediction for Kinematic Trajectories in Vehicle-Pedestrian-Mixed Scenes |
车辆-行人混合场景中运动学轨迹的联合预测
|
|
Learning to Caption Images Through a Lifetime by Asking Questions |
通过提问终身学习图像描述
|
|
VrR-VG: Refocusing Visually-Relevant Relationships |
VrR-VG:重新聚焦视觉相关的关系
|
|
TAPA-MVS: Textureless-Aware PAtchMatch Multi-View Stereo |
TAPA-MVS:感知无纹理区域的PatchMatch多视图立体
|
多视图立体匹配 |
U4D: Unsupervised 4D Dynamic Scene Understanding |
U4D:无监督4D动态场景理解
|
|
Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation |
基于层次点-边交互网络的点云语义分割
|
|
Multi-Angle Point Cloud-VAE: Unsupervised Feature Learning for 3D Point Clouds From Multiple Angles by Joint Self-Reconstruction and Half-to-Half Prediction |
多角度点云VAE:基于联合自重构和半对半预测的多角度三维点云无监督特征学习
|
|
P-MVSNet: Learning Patch-Wise Matching Confidence Aggregation for Multi-View Stereo |
P-MVSNet:学习面向多视图立体的逐块匹配置信度聚合
|
多视图立体匹配 |
SME-Net: Sparse Motion Estimation for Parametric Video Prediction Through Reinforcement Learning |
SME-Net:基于强化学习的稀疏运动估计实现参数化视频预测
|
|
ClothFlow: A Flow-Based Model for Clothed Person Generation |
ClothFlow:一种基于流的着装人物生成模型
|
|
LADN: Local Adversarial Disentangling Network for Facial Makeup and De-Makeup |
LADN:用于面部化妆和卸妆的局部对抗解耦网络
|
|
Point-to-Point Video Generation |
点对点视频生成
|
|
Semantics-Enhanced Adversarial Nets for Text-to-Image Synthesis |
基于语义增强对抗网络的文本到图像合成
|
|
VTNFP: An Image-Based Virtual Try-On Network With Body and Clothing Feature Preservation |
VTNFP:一种身体和衣服特征保持的基于图像的虚拟试穿网络
|
|
Boundless: Generative Adversarial Networks for Image Extension |
Boundless:基于GAN的图像扩展
|
|
Image Synthesis From Reconfigurable Layout and Style |
基于可重构布局和风格的图像合成
|
|
Attribute Manipulation Generative Adversarial Networks for Fashion Images |
面向时尚图像的属性操控生成对抗网络
|
|
Few-Shot Unsupervised Image-to-Image Translation |
少样本无监督图像到图像转换
|
|
Very Long Natural Scenery Image Prediction by Outpainting |
利用Outpainting实现超长自然景物图像预测
|
|
Scaling Recurrent Models via Orthogonal Approximations in Tensor Trains |
利用张量列车(Tensor Train)中的正交近似扩展循环模型
|
|
A Deep Cybersickness Predictor Based on Brain Signal Analysis for Virtual Reality Contents |
面向虚拟现实内容的基于脑信号分析的深度晕动症(Cybersickness)预测器
|
|
Learning With Unsure Data for Medical Image Diagnosis |
医学影像诊断中的不确定性数据学习
|
|
Recursive Cascaded Networks for Unsupervised Medical Image Registration |
基于递归级联网络的无监督医学图像配准
|
|
DUAL-GLOW: Conditional Flow-Based Generative Model for Modality Transfer |
DUAL-GLOW:基于条件流的生成模型实现模态转换
|
|
Dilated Convolutional Neural Networks for Sequential Manifold-Valued Data |
面向序列流形值数据的空洞卷积神经网络
|
|
Align, Attend and Locate: Chest X-Ray Diagnosis via Contrast Induced Attention Network With Limited Supervision |
对齐、关注与定位:有限监督下基于对比诱导注意力网络的胸部X光诊断
|
|
Joint Acne Image Grading and Counting via Label Distribution Learning |
基于标签分布学习的痤疮图像联合分级与计数
|
|
An Alarm System for Segmentation Algorithm Based on Shape Model |
基于形状模型的分割算法报警系统
|
|
HistoSegNet: Semantic Segmentation of Histological Tissue Type in Whole Slide Images |
HistoSegNet:全切片图像中组织学组织类型的语义分割
|
|
Prior-Aware Neural Network for Partially-Supervised Multi-Organ Segmentation |
基于先验感知神经网络的部分监督多器官分割
|
|
CAMEL: A Weakly Supervised Learning Framework for Histopathology Image Segmentation |
CAMEL:组织病理学图像分割的弱监督学习框架
|
|
Conditional Recurrent Flow: Conditional Generation of Longitudinal Samples With Applications to Neuroimaging |
条件循环流:纵向样本的条件生成及其在神经影像学中的应用
|
|
Multi-Stage Pathological Image Classification Using Semantic Segmentation |
基于语义分割的多阶段病理图像分类
|
|
Semantic-Transferable Weakly-Supervised Endoscopic Lesions Segmentation |
语义可转移的弱监督内镜病变分割
|
|
Unsupervised Microvascular Image Segmentation Using an Active Contours Mimicking Neural Network |
基于活动轮廓模拟神经网络的无监督微血管图像分割
|
|
GLAMpoints: Greedily Learned Accurate Match Points |
GLAMpoints:贪婪地学习精确的匹配点
|
|