OpenAI Self-Supervised Learning Notes: Self-Supervised Learning

2023-10-27

Reposted from a WeChat official account
原文链接: https://mp.weixin.qq.com/s?__biz=Mzg4MjgxMjgyMg==&mid=2247486049&idx=1&sn=1d98375dcbb9d0d68e8733f2dd0a2d40&chksm=cf51b898f826318ead24e414144235cfd516af4abb71190aeca42b1082bd606df6973eb963f0#rd

OpenAI Self-Supervised Learning Notes


Video: https://www.youtube.com/watch?v=7l6fttRJzeU
Slides: https://nips.cc/media/neurips-2021/Slides/21895.pdf

Self-Supervised Learning
Self-Prediction and Contrastive Learning

  • Self-Supervised Learning
    • a popular paradigm of representation learning

Outline

  • Introduction: motivation, basic concepts, examples
  • Early Work: a look into connections with older methods
  • Methods
    • Self-prediction
    • Contrastive Learning
    • (for each subsection, present the framework and categorization)
  • Pretext tasks: a broad review of the literature
  • Techniques: improve training efficiency

Introduction

What is self-supervised learning and why do we need it?

What is self-supervised learning?
  • Self-supervised learning (SSL):
    • a special type of representation learning that enables learning good data representations from unlabelled datasets
  • Motivation:
    • the idea of constructing supervised learning tasks out of unsupervised datasets

    • Why?

      ✅ Data labeling is expensive, so high-quality labeled datasets are limited

      ✅ Learning good representations makes it easier to transfer useful information to a variety of downstream tasks ⇒ e.g., few-shot learning / zero-shot transfer to new tasks

Self-supervised learning tasks are also known as pretext tasks

What’s Possible with Self-Supervised Learning?
  • Video Colorization (Vondrick et al. 2018)

    • a self-supervised learning method

    • resulting in a rich representation

    • can be used for video segmentation + unlabelled visual region tracking, without extra fine-tuning

    • just label the first frame

      picture 1

  • Zero-shot CLIP (Radford et al. 2021)

    • Despite not being trained on supervised labels

    • the zero-shot CLIP classifier achieves strong performance on challenging image classification tasks

      picture 2

Early Work

Precursors to recent self-supervised approaches

Early Work: Connecting the Dots

Some ideas:

  • Restricted Boltzmann Machines

  • Autoencoders

  • Word2Vec

  • Autoregressive Modeling

  • Siamese networks

  • Multiple Instance / Metric Learning

Restricted Boltzmann Machines
  • RBM:
    • a special case of Markov random fields

      picture 3

    • consisting of visible units and hidden units

    • has connections between every pair of visible and hidden units, but none within each group (a Gibbs-sampling sketch follows the figure)

      picture 4
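Because the graph is bipartite, all hidden units are conditionally independent given the visible units, and vice versa, which makes block Gibbs sampling cheap. A minimal sketch assuming Bernoulli units (the names W, b, c are illustrative, not from the slides):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v, W, b, c, rng):
    """One block Gibbs step in a Bernoulli RBM.
    W: (n_visible, n_hidden) weights; b, c: visible / hidden biases."""
    p_h = sigmoid(c + v @ W)                      # hiddens independent given v
    h = (rng.random(p_h.shape) < p_h).astype(float)
    p_v = sigmoid(b + h @ W.T)                    # visibles independent given h
    v_new = (rng.random(p_v.shape) < p_v).astype(float)
    return v_new, h
```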

Autoencoder: Self-Supervised Learning for Vision in Early Days
  • Autoencoder: a precursor to modern self-supervised approaches
    • Such as Denoising Autoencoder
  • Has inspired many self-supervised approaches in later years
    • such as masked language models (e.g., BERT) and MAE

picture 5

Word2Vec: Self-Supervised Learning for Language
  • Word embeddings map words to vectors
    • extracting features of words
  • idea:
    • the sum of neighboring word embeddings is predictive of the word in the middle (a CBOW sketch follows the figure)

picture 6
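A minimal sketch of this idea, the CBOW variant: sum the context embeddings and score every vocabulary word as the candidate middle word. Sizes and ids here are toy assumptions; real Word2Vec trains this with negative sampling or hierarchical softmax.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 10_000, 100
E_in = rng.normal(scale=0.01, size=(vocab_size, dim))   # context embeddings
E_out = rng.normal(scale=0.01, size=(vocab_size, dim))  # target embeddings

def cbow_logits(context_ids):
    """Score every vocabulary word as the word in the middle."""
    h = E_in[context_ids].sum(axis=0)   # the sum of neighboring embeddings
    return E_out @ h                    # one logit per vocabulary word

logits = cbow_logits([5, 7, 42, 13])    # toy context token ids
predicted_word = int(np.argmax(logits))
```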

  • An interesting phenomenon resulting from Word2Vec:
    • you can observe linear substructures in the embedding space: the lines connecting comparable concepts, such as corresponding masculine and feminine words, are roughly parallel

      picture 7

Autoregressive Modeling
  • Autoregressive model:

    • Autoregressive (AR) models are a class of time series models in which the value at a given time step is modeled as a linear function of previous values

    • NADE: Neural Autoregressive Distribution Estimator

      picture 8

  • Autoregressive models have also been the basis for many self-supervised methods such as GPT
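As a concrete toy version of the definition above, here is a least-squares fit of a linear AR(p) model (the helper is hypothetical, not from the tutorial):

```python
import numpy as np

def fit_ar(series, p):
    """Fit a linear AR(p) model by least squares: each value is a linear
    function of the p values before it, plus an intercept."""
    n = len(series)
    X = np.stack([series[i:n - p + i] for i in range(p)], axis=1)  # lagged values
    X = np.hstack([X, np.ones((n - p, 1))])                        # intercept column
    y = series[p:]                                                 # targets x_t
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # lag coefficients (oldest lag first), then the intercept

# usage: one-step-ahead prediction from the last p observed values
x = np.sin(np.linspace(0, 20, 200))
coef = fit_ar(x, p=3)
next_value = x[-3:] @ coef[:-1] + coef[-1]
```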

Siamese Networks

Many contrastive self-supervised learning methods use a pair of neural networks and learn from their difference; this idea can be traced back to Siamese networks.

  • Self-organizing neural networks
    • where two neural networks take separate but related parts of the input and learn to maximize the agreement between the two outputs
  • Siamese Networks
    • if you believe that one network f can encode x well into a good representation f(x)

    • then, for two different inputs x1 and x2, their distance can be defined as d(x1, x2) = L(f(x1), f(x2))

    • the idea of running two identical CNNs on two different inputs and then comparing them is called a Siamese network

    • Train by (a loss sketch follows the figure):

      ✅ If xi and xj are the same person, ||f(xi) − f(xj)|| is small

      ✅ If xi and xj are different people, ||f(xi) − f(xj)|| is large

picture 9
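One standard way to turn that training rule into an objective is the pairwise contrastive loss of Hadsell et al. (2006); the margin value below is a common default, not the slide's exact setting:

```python
import numpy as np

def pair_loss(f_xi, f_xj, same_person, margin=1.0):
    """Pull embeddings of the same person together; push different people
    apart until they are at least `margin` away from each other."""
    d = np.linalg.norm(f_xi - f_xj)
    if same_person:
        return d ** 2                     # ||f(xi) - f(xj)|| should be small
    return max(0.0, margin - d) ** 2      # ||f(xi) - f(xj)|| should be large
```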

Multiple Instance Learning & Metric Learning

Precursors of the recent contrastive learning techniques: multiple instance learning and metric learning

  • these methods deviate from the typical framework of empirical risk minimization

    • they define the objective function in terms of multiple samples from the dataset ⇒ multiple instance learning
  • early work:

    • centered around non-linear dimensionality reduction
    • e.g., multi-dimensional scaling and locally linear embedding
    • better than PCA: they can preserve the local structure of data samples
  • metric learning:

    • x and y: two samples
    • A: a learnable positive semi-definite matrix
    • the learned metric is d_A(x, y) = sqrt((x − y)^T A (x − y))
  • Contrastive loss:

    • uses a spring-system analogy: decrease the distance between inputs of the same type, and increase it between inputs of different types
  • Triplet loss

    • another way to obtain a learned metric
    • defined using 3 data points
    • anchor, positive and negative
    • the anchor is trained to be similar to the positive and dissimilar to the negative
  • N-pair loss:

    • a generalization of the triplet loss
    • recent contrastive learning takes the N-pair loss as its prototype (both losses are sketched below)

picture 13
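Minimal sketches of the two losses just described, on plain embedding vectors (margin and similarity choices are common defaults, not necessarily the tutorial's):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """The anchor must be closer to the positive than to the negative,
    by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

def n_pair_loss(anchor, positive, negatives):
    """N-pair loss (Sohn 2016): one positive against N-1 negatives,
    via a softmax over dot-product similarities."""
    pos = anchor @ positive
    negs = np.array([anchor @ n for n in negatives])
    return float(np.log(1.0 + np.exp(negs - pos).sum()))
```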

Methods

  • self-prediction
  • Contrastive learning
Methods for Framing Self-Supervised Learning Tasks
  • Self-prediction: Given an individual data sample, the task is to predict one part of the sample given the other part
    • i.e., “intra-sample” prediction

The part to be predicted pretends to be missing

  • Contrastive learning: Given multiple data samples, the task is to predict the relationship among them
    • relationship: can be based on the internal logic within the data

      ✅ such as different camera views of the same scene

      ✅ or create multiple augmented versions of the same sample

The multiple samples can be selected from the dataset based on some known logic (e.g., the order of words / sentences), or fabricated by altering the original version
i.e., we know the true relationship between samples but pretend not to know it

Self-Prediction
  • Self-prediction constructs prediction tasks within each individual data sample

    • to predict a part of the data from the rest while pretending we don’t know that part

    • The following figure demonstrates how flexible and diverse the options are for constructing self-prediction learning tasks

      ✅ can mask any dimensions

      picture 14

  • Categories:

    • Autoregressive generation
    • Masked generation
    • Innate relationship prediction
    • Hybrid self-prediction
Self-prediction: Autoregressive Generation
  • The autoregressive model predicts future behavior based on past behavior

    • Any data that comes with an innate sequential order can be modeled autoregressively
  • Examples :

    • Audio (WaveNet, WaveRNN)
    • Autoregressive language modeling (GPT, XLNet)
    • Images in raster scan (PixelCNN, PixelRNN, iGPT)
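All of these share the same training signal: the negative log-likelihood of each element given the elements before it. A minimal sketch of that loss, assuming the model has already produced causal logits (logits[t] computed only from tokens[:t+1]):

```python
import numpy as np

def next_token_nll(logits, tokens):
    """Average negative log-likelihood of tokens[t+1] under logits[t].
    logits: (len(tokens) - 1, vocab_size); tokens: integer ids."""
    logits = logits - logits.max(axis=-1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    targets = tokens[1:]                                   # shift by one position
    return float(-log_probs[np.arange(len(targets)), targets].mean())
```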
Self-Prediction: Masked Generation
  • mask a random portion of the information and pretend it is missing, irrespective of the natural sequence

    • The model learns to predict the missing portion given other unmasked information
  • e.g.,

    • predicting random words based on other words in the same context around it
  • Examples :

    • Masked language modeling (BERT)
    • Images with masked patch (denoising autoencoder, context autoencoder, colorization)
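A minimal sketch of the masking step (BERT additionally leaves some selected tokens unchanged or replaces them with random ones; that refinement is omitted, and MASK_ID is a hypothetical reserved id):

```python
import numpy as np

MASK_ID = 0  # hypothetical id reserved for the [MASK] token

def mask_tokens(tokens, mask_ratio=0.15, seed=0):
    """Hide a random portion of the tokens; the hidden originals become the
    prediction targets, and all other positions are ignored by the loss."""
    rng = np.random.default_rng(seed)
    tokens = np.asarray(tokens)
    is_masked = rng.random(tokens.shape) < mask_ratio
    inputs = np.where(is_masked, MASK_ID, tokens)
    targets = np.where(is_masked, tokens, -1)   # -1 = position not scored
    return inputs, targets
```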
Self-Prediction: Innate Relationship Prediction
  • Some transformations (e.g., segmentation, rotation) of a data sample should maintain the original information or follow the desired innate logic

  • Examples

    • Order of image patches

      ✅ e.g., shuffle the patches

      ✅ e.g., relative position, jigsaw puzzle

    • Image rotation

    • Counting features across patches
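For example, the rotation task (Gidaris et al. 2018) turns a single unlabeled image into a 4-way classification problem; a minimal sketch:

```python
import numpy as np

def rotation_pretext(image, seed=None):
    """Rotate an (H, W, C) image by k * 90 degrees; the model must classify k,
    which requires recognizing the object's canonical orientation."""
    rng = np.random.default_rng(seed)
    k = int(rng.integers(4))                      # label in {0, 1, 2, 3}
    return np.rot90(image, k=k, axes=(0, 1)), k
```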

Self-Prediction: Hybrid Self-Prediction Models

Hybrid self-prediction models combine different types of generative modeling.

  • VQ-VAE + AR
    • Jukebox (Dhariwal et al. 2020), DALL-E (Ramesh et al. 2021)
  • VQ-VAE + AR + Adversarial
    • VQGAN (Esser & Rombach et al. 2021)

    • VQ-VAE: learns a discrete codebook of context-rich visual parts

    • A transformer model: trained to autoregressively model the composition of this codebook

      picture 15

Contrastive Learning
  • Goal:

    • To learn such an embedding space in which similar sample pairs stay close to each other while dissimilar ones are far apart

      picture 16

  • Contrastive learning can be applied to both supervised and unsupervised settings

    • when working with unsupervised data, contrastive learning is one of the most powerful approaches in self-supervised learning
  • Category

    • Inter-sample classification
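Inter-sample classification is commonly implemented with an InfoNCE-style loss: given an anchor, identify its positive among a set of candidates via cross-entropy over similarities. A minimal sketch (cosine similarity and the temperature value are common defaults, not necessarily the tutorial's):

```python
import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Cross-entropy of picking the positive out of all candidates,
    using temperature-scaled cosine similarities."""
    cands = np.vstack([positive] + list(negatives))
    cands = cands / np.linalg.norm(cands, axis=1, keepdims=True)
    a = anchor / np.linalg.norm(anchor)
    sims = cands @ a / temperature       # index 0 is the positive pair
    sims = sims - sims.max()             # numerical stability
    return float(-(sims[0] - np.log(np.exp(sims).sum())))
```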

