Row-wise concatenation of DataFrames with different columns: the concat function

2023-05-16

from pandas import read_csv, concat

# Load the three sample DataFrames (each CSV has the columns A, B, C, D)
s1 = read_csv("concat_1.csv")
s2 = read_csv("concat_2.csv")
s3 = read_csv("concat_3.csv")

print("-----------s1-----------------")
print(s1)

print("------------s2----------------")
print(s2)

print("-------------s3---------------")
print(s3)


# Rename the columns so that the three frames only partly overlap
s1.columns = ["A", "B", "C", "D"]
s2.columns = ["E", "F", "G", "H"]
s3.columns = ["A", "C", "F", "H"]


print("-----------s1-----------------")
print(s1)

print("------------s2----------------")
print(s2)

print("------------s3----------------")
print(s3)

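# The default join is "outer": keep the union of all columns and fill gaps with NaN.
# sort=True sorts the column labels of the result.
print("-----------default (outer) join, sort=True--------------")
print(concat([s1, s2, s3], sort=True))


# join="inner" keeps only the columns shared by every input (none here)
print("-------------s1s2s3 join=inner---------------")
print(concat([s1, s2, s3], join="inner"))

# s1 and s3 share the columns A and C
print("-------------s1s3 join=inner---------------")
print(concat([s1, s3], join="inner"))

# s2 and s3 share the columns F and H
print("-------------s2s3---------------")
print(concat([s2, s3], join="inner"))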

output

-----------s1-----------------
    A   B   C   D
0  a0  b0  c0  d0
1  a1  b1  c1  d1
2  a2  b2  c2  d2
3  a3  b3  c3  d3
------------s2----------------
    A   B   C   D
0  a4  b4  c4  d4
1  a5  b5  c5  d5
2  a6  b6  c6  d6
3  a7  b7  c7  d7
-------------s3---------------
     A    B    C    D
0   a8   b8   c8   d8
1   a9   b9   c9   d9
2  a10  b10  c10  d10
3  a11  b11  c11  d11
-----------s1-----------------
    A   B   C   D
0  a0  b0  c0  d0
1  a1  b1  c1  d1
2  a2  b2  c2  d2
3  a3  b3  c3  d3
------------s2----------------
    E   F   G   H
0  a4  b4  c4  d4
1  a5  b5  c5  d5
2  a6  b6  c6  d6
3  a7  b7  c7  d7
------------s3----------------
     A    C    F    H
0   a8   b8   c8   d8
1   a9   b9   c9   d9
2  a10  b10  c10  d10
3  a11  b11  c11  d11
-----------default (outer) join, sort=True--------------
     A    B    C    D    E    F    G    H
0   a0   b0   c0   d0  NaN  NaN  NaN  NaN
1   a1   b1   c1   d1  NaN  NaN  NaN  NaN
2   a2   b2   c2   d2  NaN  NaN  NaN  NaN
3   a3   b3   c3   d3  NaN  NaN  NaN  NaN
0  NaN  NaN  NaN  NaN   a4   b4   c4   d4
1  NaN  NaN  NaN  NaN   a5   b5   c5   d5
2  NaN  NaN  NaN  NaN   a6   b6   c6   d6
3  NaN  NaN  NaN  NaN   a7   b7   c7   d7
0   a8  NaN   b8  NaN  NaN   c8  NaN   d8
1   a9  NaN   b9  NaN  NaN   c9  NaN   d9
2  a10  NaN  b10  NaN  NaN  c10  NaN  d10
3  a11  NaN  b11  NaN  NaN  c11  NaN  d11
-------------s1s2s3 join=inner---------------
Empty DataFrame
Columns: []
Index: [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]
-------------s1s3 join=inner---------------
     A    C
0   a0   c0
1   a1   c1
2   a2   c2
3   a3   c3
0   a8   b8
1   a9   b9
2  a10  b10
3  a11  b11
-------------s2s3---------------
     F    H
0   b4   d4
1   b5   d5
2   b6   d6
3   b7   d7
0   c8   d8
1   c9   d9
2  c10  d10
3  c11  d11
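
The CSV files are not included with this post, so here is a minimal sketch that rebuilds equivalent frames in memory and checks which columns survive each join. The helper `make_frame` is hypothetical, introduced only for this illustration, and a reasonably recent pandas is assumed.

```python
import pandas as pd


def make_frame(columns, start):
    """Hypothetical helper: build a 4-row frame with cells like a0, b0, c0, d0."""
    letters = ["a", "b", "c", "d"]
    data = {
        col: [f"{letter}{i}" for i in range(start, start + 4)]
        for col, letter in zip(columns, letters)
    }
    return pd.DataFrame(data)


# Same column layout as s1, s2, s3 after renaming in the post
s1 = make_frame(["A", "B", "C", "D"], 0)
s2 = make_frame(["E", "F", "G", "H"], 4)
s3 = make_frame(["A", "C", "F", "H"], 8)

# Outer join (the default): union of all columns, missing cells become NaN
outer = pd.concat([s1, s2, s3], sort=True)

# Inner join: only the columns present in every input are kept
inner_all = pd.concat([s1, s2, s3], join="inner")
inner_13 = pd.concat([s1, s3], join="inner")
inner_23 = pd.concat([s2, s3], join="inner")

print(list(outer.columns))      # all eight labels, A through H
print(list(inner_all.columns))  # no column is shared by all three frames
print(list(inner_13.columns))   # A and C
print(list(inner_23.columns))   # F and H
```

In every case the rows are stacked unchanged, so each result has 12 or 8 rows; only the set of columns depends on `join`.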


Source code

The signature and docstring below come from an older pandas release (the sort parameter is marked as added in 0.23.0). join_axes and Panel no longer exist in pandas 1.0 and later, so check the documentation for the version you have installed.

def concat(objs, axis=0, join='outer', join_axes=None, ignore_index=False,
           keys=None, levels=None, names=None, verify_integrity=False,
           sort=None, copy=True):

    """

    Parameters
    ----------

    objs : a sequence or mapping of Series, DataFrame, or Panel objects
        If a dict is passed, the sorted keys will be used as the `keys`
        argument, unless it is passed, in which case the values will be
        selected (see below). Any None objects will be dropped silently unless
        they are all None in which case a ValueError will be raised

    axis : {0/'index', 1/'columns'}, default 0
        The axis to concatenate along


    join : {'inner', 'outer'}, default 'outer'
        How to handle indexes on other axis(es)


    join_axes : list of Index objects
        Specific indexes to use for the other n - 1 axes instead of performing
        inner/outer set logic


    ignore_index : boolean, default False
        If True, do not use the index values along the concatenation axis. The
        resulting axis will be labeled 0, ..., n - 1. This is useful if you are
        concatenating objects where the concatenation axis does not have
        meaningful indexing information. Note the index values on the other
        axes are still respected in the join.
    keys : sequence, default None
        If multiple levels passed, should contain tuples. Construct
        hierarchical index using the passed keys as the outermost level
    levels : list of sequences, default None
        Specific levels (unique values) to use for constructing a
        MultiIndex. Otherwise they will be inferred from the keys
    names : list, default None
        Names for the levels in the resulting hierarchical index
    verify_integrity : boolean, default False
        Check whether the new concatenated axis contains duplicates. This can
        be very expensive relative to the actual data concatenation


    sort : boolean, default None
        Sort non-concatenation axis if it is not already aligned when `join`
        is 'outer'. The current default of sorting is deprecated and will
        change to not-sorting in a future version of pandas.

        Explicitly pass ``sort=True`` to silence the warning and sort.
        Explicitly pass ``sort=False`` to silence the warning and not sort.

        This has no effect when ``join='inner'``, which already preserves
        the order of the non-concatenation axis.

        .. versionadded:: 0.23.0

    copy : boolean, default True
        If False, do not copy data unnecessarily

    Returns
    -------
    concatenated : object, type of objs
        When concatenating all ``Series`` along the index (axis=0), a
        ``Series`` is returned. When ``objs`` contains at least one
        ``DataFrame``, a ``DataFrame`` is returned. When concatenating along
        the columns (axis=1), a ``DataFrame`` is returned.

    Notes
    -----
    The keys, levels, and names arguments are all optional.

    A walkthrough of how this method fits in with other tools for combining
    pandas objects can be found `here
    <http://pandas.pydata.org/pandas-docs/stable/merging.html>`__.



    See Also
    --------
    Series.append
    DataFrame.append
    DataFrame.join
    DataFrame.merge

    Examples
    --------

    Combine two ``Series``.

    >>> s1 = pd.Series(['a', 'b'])
    >>> s2 = pd.Series(['c', 'd'])
    >>> pd.concat([s1, s2])
    0    a
    1    b
    0    c
    1    d
    dtype: object


    ignore_index=True
    -------------------------------------

    Clear the existing index and reset it in the result
    by setting the ``ignore_index`` option to ``True``.

    >>> pd.concat([s1, s2], ignore_index=True)
    0    a
    1    b
    2    c
    3    d
    dtype: object


    keys=['s1', 's2',]
    --------------------------------------

    Add a hierarchical index at the outermost level of
    the data with the ``keys`` option.

    >>> pd.concat([s1, s2], keys=['s1', 's2',])
    s1  0    a
        1    b
    s2  0    c
        1    d
    dtype: object

    Label the index keys you create with the ``names`` option.

    >>> pd.concat([s1, s2], keys=['s1', 's2'],
    ...           names=['Series name', 'Row ID'])
    Series name  Row ID
    s1           0         a
                 1         b
    s2           0         c
                 1         d
    dtype: object

    Combine two ``DataFrame`` objects with identical columns.

    >>> df1 = pd.DataFrame([['a', 1], ['b', 2]],
    ...                    columns=['letter', 'number'])
    >>> df1
      letter  number
    0      a       1
    1      b       2
    >>> df2 = pd.DataFrame([['c', 3], ['d', 4]],
    ...                    columns=['letter', 'number'])
    >>> df2
      letter  number
    0      c       3
    1      d       4
    >>> pd.concat([df1, df2])
      letter  number
    0      a       1
    1      b       2
    0      c       3
    1      d       4

    Combine ``DataFrame`` objects with overlapping columns
    and return everything. Columns outside the intersection will
    be filled with ``NaN`` values.

    >>> df3 = pd.DataFrame([['c', 3, 'cat'], ['d', 4, 'dog']],
    ...                    columns=['letter', 'number', 'animal'])
    >>> df3
      letter  number animal
    0      c       3    cat
    1      d       4    dog
    >>> pd.concat([df1, df3])
      animal letter  number
    0    NaN      a       1
    1    NaN      b       2
    0    cat      c       3
    1    dog      d       4


    join="inner"
    ------------------------------

    Combine ``DataFrame`` objects with overlapping columns
    and return only those that are shared by passing ``inner`` to
    the ``join`` keyword argument.

    >>> pd.concat([df1, df3], join="inner")
      letter  number
    0      a       1
    1      b       2
    0      c       3
    1      d       4



    axis=1
    ---------------------------------

    Combine ``DataFrame`` objects horizontally along the x axis by
    passing in ``axis=1``.

    >>> df4 = pd.DataFrame([['bird', 'polly'], ['monkey', 'george']],
    ...                    columns=['animal', 'name'])
    >>> pd.concat([df1, df4], axis=1)
      letter  number  animal    name
    0      a       1    bird   polly
    1      b       2  monkey  george

    Prevent the result from including duplicate index values with the
    ``verify_integrity`` option.

    >>> df5 = pd.DataFrame([1], index=['a'])
    >>> df5
       0
    a  1
    >>> df6 = pd.DataFrame([2], index=['a'])
    >>> df6
       0
    a  2
    >>> pd.concat([df5, df6], verify_integrity=True)
    Traceback (most recent call last):
        ...
    ValueError: Indexes have overlapping values: ['a']
    """
    op = _Concatenator(objs, axis=axis, join_axes=join_axes,
                       ignore_index=ignore_index, join=join,
                       keys=keys, levels=levels, names=names,
                       verify_integrity=verify_integrity,
                       copy=copy, sort=sort)
    return op.get_result()
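
As a small sketch of how the `sort`, `ignore_index`, and `keys` parameters documented above behave when the inputs have different columns: the frames `left` and `right` are made up for this illustration, and a reasonably recent pandas is assumed.

```python
import pandas as pd

# Two made-up frames that share only column A
left = pd.DataFrame({"B": ["b0", "b1"], "A": ["a0", "a1"]})
right = pd.DataFrame({"C": ["c2", "c3"], "A": ["a2", "a3"]})

# sort=False keeps columns in order of first appearance: B, A, C
unsorted = pd.concat([left, right], sort=False)

# sort=True sorts the column labels: A, B, C
sorted_ = pd.concat([left, right], sort=True)

# ignore_index=True replaces the repeated row labels 0, 1, 0, 1 with 0..3
renumbered = pd.concat([left, right], ignore_index=True, sort=False)

# keys adds an outer index level recording which input each row came from
labelled = pd.concat([left, right], keys=["left", "right"], sort=False)

print(list(unsorted.columns))
print(list(sorted_.columns))
print(list(renumbered.index))
print(labelled.loc[("right", 0), "A"])
```

`ignore_index=True` is the simpler choice when the original row labels carry no meaning, while `keys` is useful when you later need to select the rows of one input again.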
