I want to sum the values of consecutive rows between two conditions. Here is my DataFrame:
import pandas as pd

df = pd.DataFrame({
    'A': ["yes", "no", "no", "no", "no", "yes", "yes", "no", "no", "no", "yes"],
    'B': ["no", "no", "no", "no", "yes", "yes", "no", "no", "no", "yes", "yes"],
    'C': [2, 5, 1, 4, 6, 13, 7, 8, 3, 9, 1],
}, index=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
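Specifically, when A="yes" and B="no", I want to start adding up the row values, and keep going until A="no" and B="yes". I would like to get the following result: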
df1 = pd.DataFrame({'A': ["yes","yes"],'B':["no","no"],'C':[12,18]},index=[0, 6])
Reply from a uj5u.com user:
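You can get the desired result by applying a boolean mask twice.
The first mask selects the rows that belong to a sum. The second marks the row where each run starts, so its cumulative sum can be used to group the selected rows and total them.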
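# +1 on a start row (A == 'yes', B == 'no'), -1 on a stop row (A == 'no', B == 'yes'), 0 otherwise.
# The running total is nonzero on the rows inside a run; the stop row itself is excluded.
mask = df.apply(lambda x: 1 if (x['A']=='yes')&(x['B']=='no') else (-1 if (x['A']=='no')&(x['B']=='yes') else 0), axis=1).cumsum().astype(bool)
# Flag the start row of each run; the cumulative sum of these flags labels the runs 1, 2, ...
mask2 = df[mask].apply(lambda x: 1 if (x['A']=='yes') & (x['B']=='no') else 0, axis=1)
# Keep one row per run (its start row); .copy() avoids a SettingWithCopyWarning on the assignment below.
out = df[mask][mask2.astype(bool)].copy()
out['C'] = df[mask].groupby(mask2.cumsum())['C'].sum().to_numpy()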
Output:
     A   B   C
0  yes  no  12
6  yes  no  18
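The same idea can also be written without the row-wise `apply` calls. The following is only a sketch, not part of the original reply: the names `start`, `stop`, `inside`, `run_id`, and `sums` are illustrative, and it assumes that start and stop rows strictly alternate, beginning with a start row, as they do in the question's data.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["yes", "no", "no", "no", "no", "yes", "yes", "no", "no", "no", "yes"],
    "B": ["no", "no", "no", "no", "yes", "yes", "no", "no", "no", "yes", "yes"],
    "C": [2, 5, 1, 4, 6, 13, 7, 8, 3, 9, 1],
})

# Rows where summing begins and rows where it ends (the stop row is not summed).
start = (df["A"] == "yes") & (df["B"] == "no")
stop = (df["A"] == "no") & (df["B"] == "yes")

# +1 on a start row, -1 on a stop row; the running total is nonzero inside a run.
inside = (start.astype(int) - stop.astype(int)).cumsum().astype(bool)

# Run label: increases by one at every start row.
run_id = start.cumsum()

# Sum C over the rows inside each run.
sums = df.loc[inside, "C"].groupby(run_id[inside]).sum()

# Keep the start row of each run and replace C with the run total.
out = df.loc[start].copy()
out["C"] = sums.to_numpy()
print(out)
```

On the question's data this yields the same two rows, at index 0 and 6, with C equal to 12 and 18. If two start rows could appear without a stop row between them, or a stop row could come before the first start row, the grouping would need to be adjusted.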