Learning Apache Druid: Processes and servers

2023-10-31

Processes and servers · Apache Druid

Process types

Druid has several process types:

  • Coordinator processes manage data availability on the cluster. -- Note: data scheduling.
  • Overlord processes control the assignment of data ingestion workloads. -- Note: controls data ingestion and how it is assigned.
  • Broker processes handle queries from external clients. -- Note: handles client requests.
  • Router processes are optional; they route requests to Brokers, Coordinators, and Overlords. -- Note: a router; at query time it picks which nodes handle the request.
  • Historical processes store queryable data. -- Note: stores and serves historical data (responsible for both storage and querying). A kind of cache?
  • MiddleManager processes ingest data. -- Note: handles incoming data (real-time data and indexing).

Server types

Druid processes can be deployed any way you like, but for ease of deployment we suggest organizing them into three server types:

  • Master: Runs Coordinator and Overlord processes, manages data availability and ingestion. -- Note: responsible for data availability and ingestion.
  • Query: Runs Broker and optional Router processes, handles queries from external clients. -- Note: handles external requests; a Query server does not store data.
  • Data: Runs Historical and MiddleManager processes, executes ingestion workloads and stores all queryable data. -- Note: where the data actually lives.
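The grouping above can be written down as a small lookup table. This is only an illustrative sketch: the dictionary and the `server_type_for` helper are names made up for these notes, not part of Druid.

```python
# Server type -> process types it runs, exactly as listed in the notes above.
SERVER_TYPES = {
    "Master": ["Coordinator", "Overlord"],
    "Query": ["Broker", "Router"],  # the Router is optional
    "Data": ["Historical", "MiddleManager"],
}


def server_type_for(process):
    """Return the server type that hosts the given process type."""
    for server_type, processes in SERVER_TYPES.items():
        if process in processes:
            return server_type
    raise KeyError(f"unknown process type: {process}")


if __name__ == "__main__":
    for name in ("Coordinator", "Broker", "Historical"):
        print(f"{name} runs on a {server_type_for(name)} server")
```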

Master server (contains the Coordinator and Overlord)

A Master server manages data ingestion and availability: it is responsible for starting new ingestion jobs and coordinating availability of data on the "Data servers" described below.

Within a Master server, functionality is split between two processes, the Coordinator and Overlord.

Note: the Master is responsible for data availability and ingestion. It starts ingestion jobs and coordinates data availability.

Coordinator process

Coordinator processes watch over the Historical processes on the Data servers. They are responsible for assigning segments to specific servers, and for ensuring segments are well-balanced across Historicals.

Note: the Coordinator watches the Historical processes, decides which server each segment is assigned to, and keeps segments balanced across the Historicals.
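To make "well-balanced" concrete, here is a toy balancer that always places the next segment on the least-loaded Historical. It is not Druid's actual balancing strategy, just a minimal greedy sketch; the function name, segment names, and sizes are all invented for illustration.

```python
import heapq


def assign_segments(segment_sizes, historicals):
    """Greedily place each segment on the currently least-loaded Historical.

    segment_sizes: dict mapping segment name -> size (hypothetical units)
    historicals:   list of Historical server names
    """
    if not historicals:
        raise ValueError("need at least one Historical")
    # Heap entries are (load_so_far, server_name); the smallest load pops first.
    heap = [(0, name) for name in historicals]
    heapq.heapify(heap)
    placement = {name: [] for name in historicals}
    # Placing the largest segments first keeps the final loads closer together.
    ordered = sorted(segment_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for segment, size in ordered:
        load, name = heapq.heappop(heap)
        placement[name].append(segment)
        heapq.heappush(heap, (load + size, name))
    return placement


if __name__ == "__main__":
    sizes = {"seg-a": 40, "seg-b": 30, "seg-c": 20, "seg-d": 10}
    print(assign_segments(sizes, ["h1", "h2"]))
```

With the four sample segments and two servers, both Historicals end up with a total load of 50.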

Overlord process

Overlord processes watch over the MiddleManager processes on the Data servers and are the controllers of data ingestion into Druid. They are responsible for assigning ingestion tasks to MiddleManagers and for coordinating segment publishing.

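Note: the Overlord watches the MiddleManager processes and is the controller for data ingestion into Druid. It assigns ingestion tasks to the MiddleManagers and coordinates segment publishing.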
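The task-assignment side can be pictured the same way. The sketch below hands each task to the MiddleManager with the most free task slots. The `MiddleManager` class, its `capacity` field, and the selection rule are assumptions made for this example; the real Overlord has its own configurable worker selection logic.

```python
from dataclasses import dataclass, field


@dataclass
class MiddleManager:
    name: str
    capacity: int  # hypothetical: maximum number of concurrent tasks
    running: list = field(default_factory=list)

    def free_slots(self):
        return self.capacity - len(self.running)


def assign_task(task_id, managers):
    """Give the task to the MiddleManager with the most free slots.

    Returns the chosen manager's name, or None when every manager is full.
    """
    candidates = [m for m in managers if m.free_slots() > 0]
    if not candidates:
        return None  # the task would stay pending until a slot frees up
    target = max(candidates, key=lambda m: m.free_slots())
    target.running.append(task_id)
    return target.name


if __name__ == "__main__":
    cluster = [MiddleManager("mm1", 1), MiddleManager("mm2", 2)]
    for task in ("t1", "t2", "t3", "t4"):
        print(task, "->", assign_task(task, cluster))
```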

Data server

A Data server executes ingestion jobs and stores queryable data.

Within a Data server, functionality is split between two processes, the Historical and MiddleManager.

Note: a Data server mainly runs ingestion jobs and stores the queryable data. In general, one Data server contains a Historical and a MiddleManager.


Historical process

Historical processes are the workhorses that handle storage and querying on "historical" data (including any streaming data that has been in the system long enough to be committed). Historical processes download segments from deep storage and respond to queries about these segments. They don't accept writes.

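Note: Historical processes store and query "historical" data, which includes any streaming data that has been in the system long enough to be committed. They download segments from deep storage and answer queries about those segments. They do not accept write requests.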


MiddleManager process

MiddleManager processes handle ingestion of new data into the cluster. They are responsible for reading from external data sources and publishing new Druid segments.

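Note: the MiddleManager handles ingestion of new data. It mainly reads from other data sources (Kafka, for example) and builds segments, so it is the process on the write path.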


Peon processes

Peon processes are task execution engines spawned by MiddleManagers. Each Peon runs a separate JVM and is responsible for executing a single task. Peons always run on the same host as the MiddleManager that spawned them.

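Note: a Peon is a task execution engine spawned by a MiddleManager. Each Peon runs in its own JVM, executes exactly one task, and always runs on the same host as the MiddleManager that spawned it.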


Indexer process (optional)

Indexer processes are an alternative to MiddleManagers and Peons. Instead of forking separate JVM processes per-task, the Indexer runs tasks as individual threads within a single JVM process.

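Note: the Indexer is an alternative to the MiddleManager plus Peons. Instead of forking a separate JVM process for each task, it runs each task as a thread inside a single JVM process.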

The Indexer is designed to be easier to configure and deploy compared to the MiddleManager + Peon system and to better enable resource sharing across tasks. The Indexer is a newer feature and is currently designated experimental due to the fact that its memory management system is still under development. It will continue to mature in future versions of Druid.

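Note: compared with MiddleManager + Peon, the Indexer is easier to configure and deploy and shares resources across tasks more effectively. It is a relatively new feature.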

Typically, you would deploy either MiddleManagers or Indexers, but not both.

Note: pick one or the other, never both.
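The difference between the two execution models is easier to see side by side. The sketch below is only an analogy in Python, not Druid code: `run_peon_style` starts one child process per task (like one Peon JVM per task), while `run_indexer_style` runs every task as a thread inside the current process (like the Indexer). Both function names are made up, and each "task" does nothing except report the PID it ran in.

```python
import os
import subprocess
import sys
import threading


def run_peon_style(task_ids):
    """Run each task in its own child process, like one Peon JVM per task."""
    pids = {}
    for task_id in task_ids:
        # The child only reports its own PID; a real task would ingest data.
        # For simplicity the children run one after another here.
        out = subprocess.run(
            [sys.executable, "-c", "import os; print(os.getpid())"],
            capture_output=True,
            text=True,
            check=True,
        )
        pids[task_id] = int(out.stdout.strip())
    return pids


def run_indexer_style(task_ids):
    """Run each task as a thread inside this process, like the Indexer."""
    pids = {}
    lock = threading.Lock()

    def work(task_id):
        with lock:
            pids[task_id] = os.getpid()

    threads = [threading.Thread(target=work, args=(t,)) for t in task_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pids


if __name__ == "__main__":
    print("parent pid:   ", os.getpid())
    print("peon style:   ", run_peon_style(["task-1", "task-2"]))
    print("indexer style:", run_indexer_style(["task-1", "task-2"]))
```

In the Peon-style run every task reports a PID different from the parent's; in the Indexer-style run every task reports the parent's PID, which is why the tasks can share that process's resources.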


Pros and cons of colocation

Druid processes can be colocated based on the Master/Data/Query server organization as described above. This organization generally results in better utilization of hardware resources for most clusters.

For very large scale clusters, however, it can be desirable to split the Druid processes such that they run on individual servers to avoid resource contention.

This section describes guidelines and configuration parameters related to process colocation.


Coordinators and Overlords

The workload on the Coordinator process tends to increase with the number of segments in the cluster. The Overlord's workload also increases based on the number of segments in the cluster, but to a lesser degree than the Coordinator.

In clusters with very high segment counts, it can make sense to separate the Coordinator and Overlord processes to provide more resources for the Coordinator's segment balancing workload.


Unified Process

The Coordinator and Overlord processes can be run as a single combined process by setting the druid.coordinator.asOverlord.enabled property.

Please see Coordinator Configuration: Operation for details.

Note: in clusters with very high segment counts, deploy Coordinators and Overlords separately.
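As a small illustration of the druid.coordinator.asOverlord.enabled property mentioned above, the sketch below parses a properties-style snippet and reports whether the combined mode is switched on. The parser, the helper names, and the sample file content are assumptions for this example. Treating a missing property as "off" is also an assumption of the sketch, and the official configuration reference should be checked for any companion settings the combined mode requires.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def runs_as_overlord(props):
    """True when the Coordinator is configured to also act as the Overlord."""
    # Assumption of this sketch: an absent property means the mode is off.
    value = props.get("druid.coordinator.asOverlord.enabled", "false")
    return value.lower() == "true"


# Hypothetical excerpt from a Coordinator properties file.
EXAMPLE = """
# run the Coordinator and Overlord as one combined process
druid.coordinator.asOverlord.enabled=true
"""

if __name__ == "__main__":
    print(runs_as_overlord(parse_properties(EXAMPLE)))
```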


Historicals and MiddleManagers

With higher levels of ingestion or query load, it can make sense to deploy the Historical and MiddleManager processes on separate hosts to avoid CPU and memory contention.

The Historical also benefits from having free memory for memory mapped segments, which can be another reason to deploy the Historical and MiddleManager processes separately.

Note: under heavy ingestion or query load, deploy Historicals and MiddleManagers separately.
