Code Block & Basic Block

2023-11-05

Code Block

In a programming language, a code block typically starts with certain syntactical constructs such as loops, conditionals, or function definitions. When a compiler walks the Abstract Syntax Tree (AST), it uses this syntax information to recognize the beginning of a new code block. Here are some examples:

  1. Loop Statements: for, while, or do-while loops usually signal the beginning of a new code block in C, C++, or Java.
  2. Conditional Statements: if, else if, else branches also typically begin new code blocks.
  3. Function Definitions: When a new function is defined, that typically starts a new code block.
  4. Scope Blocks: Any pair of {} in C, C++, or Java starts a new code block. Similarly, indentation in Python starts a new block.

When the compiler encounters one of these nodes in the AST, it understands that a new code block is beginning. In response, it usually performs actions like:

  1. Creating a new scope for variable declarations.
  2. Initializing a new compilation unit or other data structures to hold the instructions generated within this code block.
  3. Recording the starting position for this block, for later back-patching or other forms of code modification.

For example, let’s consider a simple Python code block:

if x > 0:
    y = 2 * x

When the compiler encounters the if x > 0: node in the AST, it will recognize it as the beginning of a new code block. The compiler then typically creates a new compilation unit or similar structure to keep track of the variables, instructions, and other related information for that block.

So, recognizing the start of a code block is a function of both the language’s syntax and the corresponding nodes in the AST that the compiler uses for generating code.


The ending of a code block is determined by the programming language’s syntactical rules and is often marked by:

  1. Closing Brackets: In languages like C, C++, and Java, a closing curly bracket } signifies the end of a block of code.

  2. Indentation Level: In Python, the end of a block is usually determined by a change in indentation back to the level of the enclosing block.

  3. Explicit End Statements: Some languages like Ruby or older languages like Fortran use explicit end statements (end in Ruby, end do in Fortran) to mark the end of a block.

  4. End of File or Function: In some contexts, reaching the end of the file or the end of a function definition could implicitly signal the end of the outermost code block.

  5. Semicolons: In some languages, a semicolon ; is used to terminate single statements. Although not strictly an “end of block” marker, its presence often helps in recognizing the extent of certain kinds of code blocks.

In the compiler’s process:

  1. When walking the Abstract Syntax Tree (AST), encountering such ending markers in a node would signal the compiler to finish the current compilation unit or basic block.

  2. At this point, the compiler might finalize the generated bytecode or machine code for the block, pop the most recent scope off the scope stack, and perform any necessary cleanup activities like deallocating resources.

  3. For languages that use opening and closing pairs (like {} in C++ or Java), compilers often check that these are balanced, throwing a compile-time error if they’re not.

  4. If the language has block-level scope, variables that were declared in the block will typically go out of scope when the block ends.

For example, consider the following Python code:

if x > 0:
    y = 2 * x
# Block ends here due to change in indentation

In the Python example, the block ends when the indentation level returns back to the same level as the if statement, marking the end of the if block. The compiler, upon reaching this point, would know that the block has ended and proceed accordingly.

By understanding these syntactic markers and using the AST, a compiler can reliably determine where each code block begins and ends.

Basic Block

In compiler theory and program analysis, identifying the start and end of Basic Blocks often relies on control flow analysis of the program code. Basic blocks have several characteristics:

  1. Start Instruction: The first instruction in a basic block is one that is not the target of any jump (i.e., no other instruction jumps to this one).

  2. End Instruction: The last instruction in a basic block is usually a control-flow instruction, such as a conditional jump, an unconditional jump, or a return. After this, the control flow may divert to one or multiple different basic blocks.

  3. Intermediate Instructions: All intermediate instructions in a basic block are executed sequentially; there are no control flow jumps in or out in the middle of a basic block.

Therefore, identifying basic blocks usually involves the following steps:

  1. Identify All Jumps and Targets: Scan through the entire code, marking all control-flow changing instructions (like jmp, je, call, etc.) and their potential target addresses.

  2. Mark Start Points: The program entry point (usually the main function or equivalent) and all jump targets are potential start points for basic blocks.

  3. Mark End Points: All instructions that change control flow (jump or return) are potential end points for basic blocks.

  4. Divide Basic Blocks: Using the marked start and end points, divide up contiguous sequences of instructions to form basic blocks.

The purpose of doing this is to facilitate subsequent compiler optimizations like constant folding, dead code elimination, and more advanced optimizations like loop invariant code motion. Basic blocks serve as the fundamental unit for these optimizations, helping to simplify analysis and improve efficiency.

Differences

The terms “code block” and “basic block” have specific meanings in the context of programming and compiler theory. Here are some key differences:

Code Block

  1. Scope of Definition: A code block is a high-level concept that usually refers to a group of statements enclosed by delimiters like braces {} in languages like C, C++, and Java, or defined by indentation in languages like Python.

  2. Purpose: Code blocks typically serve to group statements together for the purpose of scoping or control flow (e.g., loops, conditionals).

  3. Contained Elements: A code block can contain variable declarations, control flow statements (e.g., if, else, while, for), and even other nested code blocks.

  4. Language Feature: The concept of a code block is a feature of the programming language’s syntax and semantics.

  5. Not Optimized: Code blocks are not units of optimization. Any optimization that happens is usually done at the level of basic blocks or across basic blocks.

Basic Block

  1. Scope of Definition: A basic block is a lower-level concept often used in compiler theory. It refers to a straight-line sequence of instructions with no branches in and no branches out, except possibly at the end.

  2. Purpose: The primary purpose of identifying basic blocks is for optimization during the compilation process. By understanding what each basic block does, a compiler can make intelligent choices about reordering instructions, eliminating redundancies, etc.

  3. Contained Elements: A basic block contains a list of machine instructions that are executed sequentially. It does not contain control flow statements as a code block does; rather, it is the target of control flow instructions.

  4. Compiler Feature: Basic blocks are usually an intermediate representation in the compilation process rather than a language feature.

  5. Optimization Unit: Basic blocks are the primary units of optimization in many compilation strategies. Various optimization algorithms work by considering the effects of a computation at the basic block level.

In summary, a code block is a syntactic construct that groups statements together for scoping and control flow, while a basic block is a straight-line sequence of instructions used for optimization by a compiler.

本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系:hwhale#tublm.com(使用前将#替换为@)

Code Block & Basic Block 的相关文章

  • 编译原理第七章笔记 -- 中间代码生成

    本文中内容整理西安交通大学软件学院吴晓军老师的ppt中 仅供学习使用 请勿转载或他用 参考教材 程序设计语言 编译原理 第3版 陈火旺等 国防工业出版社 这一章分数在35左右 两个大题 数组的引用四元式生成 控制语句当中布尔表达式的翻译 考
  • 编译原理实验 实验一 词法分析设计 Java实现

    一 实验目的 通过本实验的编程实践 使学生了解词法分析的任务 掌握词法分析程序设计的原理和构造方法 使学生对编译的基本概念 原理和方法有完整的和清楚的理解 并能正确地 熟练地运用 二 实验内容 用 VC VB JAVA 语言实现对 C 语言
  • 合肥工业大学编译原理实验二 LL1分析

    写在开头 当老师说这个实验最好写成图形界面时 我笑了 滑稽 心想终于可以用到python了 python真香 用python的数据结构可以很方便的表示LL1的某些东西 当然有利也有弊 方便的同时也会有一些坑 当然Java也牛逼 Java的图
  • 【编译原理】FIRST集合和FOLLOW集合

    FIRST集合 定义 可从 推导得到的串的首符号的集合 其中 是任意的文法符号串 规则 计算文法符号 X 的 FIRST X 不断运用以下规则直到没有新终结符号或 可以被加入为止 1 如果 X 是一个终结符号 那么 FIRST X X 2
  • 编译原理笔记

    目录 序章 编译原理 编译器 程序设计语言 第一章 概述 机器语言 第一代语言 特点 汇编语言 高级程序设计语言 鼻祖 时期 特点 翻译程序 汇编语言 解释语言 编译程序 编译过程 词法分析 语法分析 语义分析 中间代码生成 之前三步都是编
  • 电子科技大学编译原理复习笔记(四):程序语言的设计

    目录 前言 重点一览 语言的定义 比较 生成观点与识别观点 语义又该怎么描述 符号串 符号串集合 文法 超重点 定义 组成 表示 分类 重点 文法产生的语言 短语 直接短语和句柄 求它们目的是语法分析 语法树 推导树 语言的设计 本章习题
  • 【编译原理】flex实现词法分析器

    flex自动实现词法分析器 FLEX 与 BISON 的使用 FLEX介绍 Flex是一个生成词法分析器的工具 它可以利用正则表达式来生成匹配相应字符串的C语言代码 其语法格式基本同Lex相同 单词的描述称为模式 Lexical Patte
  • 静态类型推导

    前面说泛型的时候 提到了C 模板的实现方式是动态特性静态化 在实际情况中 这是一个提高效率的好办法 动态性的好处是灵活 开发简便 静态性的特性是效率高 编译期检查较好 因此很自然地就有一个问题 能不能各取所长 达到两全其美 应该说 在一定程
  • 移入——归约技术

    归约 定义 我们可以将自底向上语法分析过程看成是建一个串w 归约 慰问发开始符号的过程 在归约中 一个与某产生式体相匹配的特定子串被替换为该产生式的头部的非终结符号 定义理解起来比较晦涩 我们来看个例子就知道了 已知有文法 E gt E T
  • 编译课设 (词法分析+LR1语法分析+语法制导翻译(四元式生成))

    代码已上传至 Github 完整的 VS2019 项目已上传至百度云 提取码 lql1 目录 源语言 语义动作 中间代码定义 整体框架 声明 语句 i f if if 语句
  • C++ 实现自动产生LR1分析器的产生器

    C 实现自动产生LR1分析器的产生器 1 介绍 2 总体思路 2 1 拓广文法 2 2 计算First集合 2 3 计算每个闭包的项目集以及GO函数 2 4 计算分析表的动作函数ACTION和状态转换函数GOTO 2 5 对语句进行语法分析
  • 初识 flex & bison

    基本概念 flex 和 bison 经常结合使用 分别用于词法分析和语法分析 词法分析器 flex flex 用于生成词法分析器或者说是扫描器 scanner 它将输入的文本分解为称为 tokens 的序列 每个 token 都有一个特定的
  • [学习flex] 1.利用flex实现文字和谐小程序

    灵感来自于09平台dota1 游戏选手对喷时经常互飙国粹 问候对方全家 后来09平台进行了聊天和谐 不和谐的文字都会被 替换 今天我就就用flex实现类似的效果 话不多说上flex代码 脏话 printf 国粹 printf printf
  • GDB 程序调试常用命令

    调试之前 若要在GDB中调试程序在编译时需要加上调试信息 在GCC中添加的方法 GCC g a c o a exe 或下面提供更符合GDB的调试信息 GCC ggdb a c o a exe 运行流程 命令 作用 start 开始执行程序
  • 词法分析器(分析C语言)

    问题描述 用C或C 语言编写一个简单的词法分析程序 扫描C语言小子集的源程序 根据给定的词法规则 识别单词 填写相应的表 如果产生词法错误 则显示错误信息 位置 并试图从错误中恢复 简单的恢复方法是忽略该字符 或单词 重新开始扫描 相关词法
  • The Backus-Naur Form (BNF) & The Extended Backus-Naur Form (EBNF)

    The Backus Naur Form BNF The Backus Naur Form BNF is a notation used for formal description of the syntax of programming
  • 编译原理三大经典书籍(龙书 虎书 鲸书)

    1 龙书 Dragon book 英文名 Compilers Principles Techniques and Tools 作者 Alfred V Aho Ravi Sethi Jeffrey D Ullman 中文名 编译原理技术和工具
  • 编译原理LL(1)文法之提取左公因子,消除左递归

    在判断LL 1 文法是否符合的时候 需要判断LL 1 文法是否存在左公因子 和左递归的情况 以下给出相应的判断方法以及通过提取左公因子和消除左递归使非LL 1 文法转换为LL 1 法的方法 第一种情况 存在左公因子 解决方法 提取左公因子
  • 语义分析- 符号表

    符号表 1 用来存储程序中的变量相关信息 类型 作用域 访问控制信息 2 必须非常高效 程序中的变量规模会很大 符号表的接口 ifndef TABLE H define TABLE H typedef Table t 数据结构 新建一个符号
  • Go 程序编译过程(基于 Go1.21)

    版本说明 Go 1 21 官方文档 Go 语言官方文档详细阐述了 Go 语言编译器的具体执行过程 Go1 21 版本可以看这个 https github com golang go tree release branch go1 21 sr

随机推荐

  • python实现简单的百度搜索

    usr bin python coding utf 8 import urllib import urllib2 实现百度关键字查询的小例子 定义基础url url http www baidu com s 定义请求头信息 headers
  • 实现字符串倒叙

    var reverse function str 倒叙的函数 return str split reverse join split切割字符串然后转换为数组 reverse是jquery的倒序方法 然后join是将其放到字符串中 let a
  • 单极性PWM和双极性PWM

    单极性与双极性PWM模式 从调制脉冲的极性看 PWM又可分为单极性与双极性控制模式两种 单极性PWM模式 产生单极性PWM模式的基本原理如图6 2所示 首先由同极性的三角波载波信号ut 与调制信号ur 比较 图6 2 a 产生单极性的PWM
  • 漫画:什么是区块链?

    点击上方 程序员小灰 选择 置顶公众号 有趣有内涵的文章第一时间送达 什么是区块链 区块链 英文 Blockchain 本质上是一种去中心化的分布式数据库 任何人只要架设自己的服务器 接入区块链网络 都可以成为这个庞大网络的一个节点 区块链
  • WScript.CreateObject(WScript.Shell)

    为什么 WScript CreateObject WScript Shell 无法执行 源 VBS 程序 Dim t Set t WScript CreateObject WScript Shell Set t Nothing WScrip
  • MySQL text类型的最大长度

    MySQL 3种text类型的最大长度如下 TEXT 65 535 bytes 64kb MEDIUMTEXT 16 777 215 bytes 16Mb LONGTEXT 4 294 967 295 bytes 4Gb 参考 http w
  • cmake的macro

    一 定义 可以把它理解为C 的宏 命令如下 macro
  • The connection to adb is down, and a severe error has occured.

    报错 The connection to adb is down and a severe error has occured 解决 cmd跳到sdk tools文件路径下 adb kill server 然后再adb start serv
  • Kotlin_读写文件

    读写文件操作记录 提取成函数 方便看其返回值 以加深理解 private fun createNewFile File var file File output txt if file exists file delete file cre
  • MES :制造执行系统 (Manufacturing Execution System)

    MES是美国管理界90年代提出的新概念 美国先进制造研究机构AMR Advanced Manufacturing Research 通过对大量企业的调查发现现有的企业生产管理系统普遍由以ERP MRPII为代表的企业管理软件 以SCADA
  • 在gitlab中生成增量代码质量分析报告

    作为管理者 你是否想在组员创建merge request时 生成代码质量分析报告 今天它来了 gitlab ci yml image python 3 11 flake8 allow failure true rules 只有flake8任
  • centos等重新编译rpm包笔记备忘

    源码包获取 直接浏览器下载或者添加source源后 直接 yumdownloader source kernel 或者dnf命令 源码包编译依赖包安装 编译之前还需要补齐编译这个包需要的依赖 当然可以rpmbuild命令提示后一个一个补 网
  • STM32中如何用systick中断来监控系统的运行时间

    define SysTick CTRL TICKINT Pos 1U define SysTick CTRL TICKINT Msk 1UL lt lt SysTick CTRL TICKINT Pos define DRV SYS TIC
  • 关于批量添加用户和域用户

    首先看批量添加用户 有三种方法 1 开始 运行 CMD 输入 for l i in 1 1 50 do net user test i 123456 add 注 1 1 50 的意思是 开始值 递增量 终值 如果想递减 50 1 1 tes
  • pycharm专业版许可证申请(特定人群)

    来到官网 PyCharm the Python IDE for Professional Developers by JetBrains 翻译一下 我是学生党 所以选择第一个 选择官方文件 这里需要学信网在新验证 学信网官网在此 中国高等教
  • webpack 如何自定义loader

    webpack中loader本质就是函数 其中前一个loader处理完代码后 交给后一个代码继续处理 最终经过多个loader的处理后 源代码变成最终代码 自定义loader其实就是自己写一个函数 在把函数导出 写在rule中即可 如图所示
  • twitter_充分利用Twitter的12种方法

    twitter There is no doubt that Twitter has been one of the hottest new web applications of the past couple of years Sinc
  • QT类的构造函数和析构函数在main函数中被引用

    问题描述 原有头文件a h 源文件a cpp main中调用a中的类A 新建头文件b h 源文件b cpp 将原来调用类A改为b中的类B 直接运行报错LNK2019 类B中的构造函数和析构函数在main函数中被引用 项目清除后重新构建依旧无
  • Bash脚本学习 - 条件句、数组、for循环,函数

    1 条件测试 和 是一个用于执行条件测试的命令 它们必须用空格分隔开 并且在 后面和 前面必须有空格 eq 是一个比较运算符 表示等于 equal 它用于比较两个值是否相等 2 条件句 在 ifelseifelse sh 文件中 bin b
  • Code Block & Basic Block

    Code Block In a programming language a code block typically starts with certain syntactical constructs such as loops con