How to summarise and spread data in R

I want to summarise my data so that the result has only three columns, as follows: col_1 = name of the country, col_2 = percentage of 0s, col_3 = percentage of 1s.

Here is the data:

country = rep(c("USA", "UK", "AUS", "ARM", "BEL", "BRA", "CHN", "EGY", "FIN", "FRA"),
              times = c(10, 5, 15, 10, 10, 10, 5, 15, 10, 10))
score= sample(c(0,1), replace=F)
dat = data.frame(country, score)

Thank you very much.
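A side note on the sample data: `sample(c(0,1), replace=F)` returns only two values, and `data.frame()` recycles them across all 100 rows, so the scores simply alternate. If the intent was one independent random score per row (an assumption, since the question does not say so), a minimal sketch could look like this. The name `dat_random` and the seed value are illustrative only.

```r
# Assumption: one independent 0/1 draw per row was intended.
country <- rep(c("USA", "UK", "AUS", "ARM", "BEL", "BRA", "CHN", "EGY", "FIN", "FRA"),
               times = c(10, 5, 15, 10, 10, 10, 5, 15, 10, 10))

set.seed(42)  # arbitrary seed, used only to make the draw reproducible

# Draw as many scores as there are rows, sampling with replacement.
score <- sample(c(0, 1), size = length(country), replace = TRUE)

dat_random <- data.frame(country, score)
```

The answers below work unchanged on either version of the data; only the resulting percentages differ.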

Reply from a uj5u.com user:

Using reshape2:

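library(reshape2)
# Count the rows per country and score; one column per score value ("0" and "1")
dat2 = dcast(dat, country ~ score, value.var = "score", fun.aggregate = length)
# Divide each count by its row total to turn counts into proportions
dat2[, c("0", "1")] = dat2[, c("0", "1")] / rowSums(dat2[, c("0", "1")])
dat2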

   country         0         1
1      ARM 0.5000000 0.5000000
2      AUS 0.5333333 0.4666667
3      BEL 0.5000000 0.5000000
4      BRA 0.5000000 0.5000000
5      CHN 0.4000000 0.6000000
6      EGY 0.5333333 0.4666667
7      FIN 0.5000000 0.5000000
8      FRA 0.5000000 0.5000000
9       UK 0.4000000 0.6000000
10     USA 0.5000000 0.5000000
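For comparison, the same count-then-divide idea can be sketched in base R without extra packages, using `table()` and `prop.table()`. This is not part of the answer above. The names `counts`, `props` and `result` are illustrative, and the scores are fixed in the order 0, 1 (instead of `sample()`) so that the result is deterministic.

```r
# Rebuild the question's data. rep_len() makes the recycling of the two
# score values explicit. Assumption: fixed order 0, 1 instead of sample().
country <- rep(c("USA", "UK", "AUS", "ARM", "BEL", "BRA", "CHN", "EGY", "FIN", "FRA"),
               times = c(10, 5, 15, 10, 10, 10, 5, 15, 10, 10))
score <- rep_len(c(0, 1), length(country))
dat <- data.frame(country, score)

# Cross-tabulate country by score, then divide each row by its row total.
counts <- table(dat$country, dat$score)
props <- prop.table(counts, margin = 1)

# Collect the proportions into a plain three-column data frame.
result <- data.frame(country = rownames(props),
                     perc0s = as.numeric(props[, "0"]),
                     perc1s = as.numeric(props[, "1"]))
result
```

Here `margin = 1` asks for proportions within each row, so each country's two values sum to 1.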

Reply from a uj5u.com user:

Another possible solution, based on the tidyverse:

library(tidyverse)

country = rep(c("USA", "UK", "AUS", "ARM", "BEL", "BRA", "CHN", "EGY", "FIN", "FRA"),
              times = c(10, 5, 15, 10, 10, 10, 5, 15, 10, 10))
score= sample(c(0,1), replace=F)
dat = data.frame(country, score)

dat %>% 
  group_by(country) %>% 
  summarise(perc0s = 1-sum(score)/n(), perc1s=1-perc0s, .groups = "drop")

#> # A tibble: 10 × 3
#>    country perc0s perc1s
#>    <chr>    <dbl>  <dbl>
#>  1 ARM      0.5    0.5  
#>  2 AUS      0.467  0.533
#>  3 BEL      0.5    0.5  
#>  4 BRA      0.5    0.5  
#>  5 CHN      0.6    0.4  
#>  6 EGY      0.467  0.533
#>  7 FIN      0.5    0.5  
#>  8 FRA      0.5    0.5  
#>  9 UK       0.6    0.4  
#> 10 USA      0.5    0.5
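Since the title asks about spreading, here is a rough sketch of the count-then-widen route with `tidyr::pivot_wider()`, which superseded `spread()`. It is a variation on the answer above, not the answerer's own code. The names `wide`, `prop`, `perc0` and `perc1` are illustrative, and the scores are fixed in the order 0, 1 so the example is deterministic.

```r
library(dplyr)
library(tidyr)

# Rebuild the question's data. Assumption: fixed order 0, 1 instead of sample().
country <- rep(c("USA", "UK", "AUS", "ARM", "BEL", "BRA", "CHN", "EGY", "FIN", "FRA"),
               times = c(10, 5, 15, 10, 10, 10, 5, 15, 10, 10))
score <- rep_len(c(0, 1), length(country))
dat <- data.frame(country, score)

wide <- dat %>%
  count(country, score) %>%          # one row per country/score pair, count in n
  group_by(country) %>%
  mutate(prop = n / sum(n)) %>%      # share of each score within its country
  ungroup() %>%
  select(-n) %>%
  pivot_wider(names_from = score,    # one column per score value
              values_from = prop,
              names_prefix = "perc",
              values_fill = list(prop = 0))  # a score a country never has becomes 0

wide
```

Unlike the `summarise()` version, this approach does not rely on the scores being exactly 0 and 1; it produces one column per distinct value of `score`.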