If row meets criteria, then TRUE else FALSE in R

I have nested data as follows:

ID Date Behavior
1 1 FALSE
1 2 FALSE
1 3 TRUE
2 3 FALSE
2 5 FALSE
2 6 TRUE
2 7 FALSE
3 1 FALSE
3 2 TRUE

I want to create a column called counter in which, for each unique ID, the counter adds one on each successive row until Behavior = TRUE.

I'm expecting this result:

ID Date Behavior counter
1 1 FALSE 1
1 2 FALSE 2
1 3 TRUE 3
2 3 FALSE 1
2 5 FALSE 2
2 6 TRUE 3
2 7 FALSE
3 1 FALSE 1
3 2 TRUE 2

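Ultimately, I'd like to extract, for each unique ID, the minimum counter at which an observation occurs. However, I'm having trouble developing a solution for the counter problem itself.

Any and all help is greatly appreciated!

I'd like to create a counter within each array of unique IDs and from there, ultimately pull the row level info. The question is how long on average does it take to reach a TRUE…
You could do library(data.table); setDT(df)[, counter := c(seq_len(which(Behavior)), rep(NA, .N - which(Behavior))), ID] but I would go with @NPE's solution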

I’d like to create a counter within each array of unique IDs and from there, ultimately pull the row level info – the question is how long on average does it take to reach a TRUE

I sense there may be an XY problem here. You can answer the latter question directly, like this:

> library(plyr)
> mean(daply(d, .(ID), function(grp)min(which(grp$Behavior))))
[1] 2.666667

(where d is your data frame.)
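
For readers without plyr, the same figure can be reproduced in base R. The following is a minimal sketch, assuming the data frame is named d as above and that an ID with no TRUE row should simply be skipped (that handling is an assumption, not part of the original answer). The names first_true and mean_rows are illustrative.

```r
# Sample data from the question.
d <- read.table(text = "ID Date Behavior
1 1 FALSE
1 2 FALSE
1 3 TRUE
2 3 FALSE
2 5 FALSE
2 6 TRUE
2 7 FALSE
3 1 FALSE
3 2 TRUE", header = TRUE)

# Row position of the first TRUE within each ID (NA if an ID has none).
first_true <- tapply(d$Behavior, d$ID, function(b) which(b)[1])

# Average number of rows needed to reach a TRUE, skipping IDs without one.
mean_rows <- mean(first_true, na.rm = TRUE)

print(first_true)
print(mean_rows)
```

Note that this counts row positions within each ID, not values of the Date column, which matches what the plyr call computes.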

do.call(rbind, by(df, list(df$ID), function(x) {n = nrow(x); data.frame(x, Counter = c(1:(m<-which(x$Behavior)), rep(NA, n-m)))}))

ID Date Behavior Counter
1.1 1 1 FALSE 1
1.2 1 2 FALSE 2
1.3 1 3 TRUE 3
2.4 2 3 FALSE 1
2.5 2 5 FALSE 2
2.6 2 6 TRUE 3
2.7 2 7 FALSE NA
3.8 3 1 FALSE 1
3.9 3 2 TRUE 2

df = read.table(text = "ID Date Behavior
1 1 FALSE
1 2 FALSE
1 3 TRUE
2 3 FALSE
2 5 FALSE
2 6 TRUE
2 7 FALSE
3 1 FALSE
3 2 TRUE", header = TRUE)

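This is what I'm trying to do, but I get an invalid times argument error. Are there any usual suspects for this kind of error?
I can't reproduce the error, but I'd guess it refers to rep.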
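
One plausible cause, offered as a guess since the data that triggered the error is not shown: rep() expects its times argument to have length one (or the length of x), so rep(NA, n - m) fails when which(x$Behavior) returns more than one index, that is, when an ID has several TRUE rows. The sketch below takes only the first TRUE per ID and pads with NA when there is none. count_to_first_true is a hypothetical helper name, not part of any package.

```r
df <- read.table(text = "ID Date Behavior
1 1 FALSE
1 2 FALSE
1 3 TRUE
2 3 FALSE
2 5 FALSE
2 6 TRUE
2 7 FALSE
3 1 FALSE
3 2 TRUE", header = TRUE)

# A 'times' vector of length 2 for an 'x' of length 1 is rejected by rep().
err <- try(rep(NA, c(1, 2)), silent = TRUE)

# Hypothetical helper: count 1, 2, ... up to the first TRUE, then pad with NA.
count_to_first_true <- function(b) {
  m <- which(b)[1]                      # first TRUE only; NA if there is none
  if (is.na(m)) return(rep(NA_integer_, length(b)))
  c(seq_len(m), rep(NA_integer_, length(b) - m))
}

# Apply the helper within each ID, then restore the original row order.
parts <- lapply(split(df$Behavior, df$ID), count_to_first_true)
df$Counter <- unsplit(parts, df$ID)

print(df)
```

Using split() and unsplit() keeps the original row names and ordering, unlike the do.call(rbind, by(...)) form, which produces row names such as 1.1 and 2.4.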

Here is a dplyr solution that finds the row number of each TRUE within each ID:

library(dplyr)
newdf <- yourdataframe %>%
  group_by(ID) %>%
  summarise(ftrue = which(Behavior))

This solution only works when each ID has exactly one TRUE; with more than one, you get an expecting a single value error.
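
If an ID can contain several TRUE rows, one possible workaround is to keep only the first index. This is a minimal sketch, assuming the question's data is in a data frame named df. The extra TRUE added to ID 2 is purely illustrative and not part of the original data.

```r
library(dplyr)

df <- read.table(text = "ID Date Behavior
1 1 FALSE
1 2 FALSE
1 3 TRUE
2 3 FALSE
2 5 FALSE
2 6 TRUE
2 7 FALSE
3 1 FALSE
3 2 TRUE", header = TRUE)

# Illustrative tweak (assumption): give ID 2 a second TRUE on its last row.
df$Behavior[7] <- TRUE

# which(Behavior)[1] always yields a single value per group:
# the first TRUE position, or NA when the group has no TRUE.
newdf <- df %>%
  group_by(ID) %>%
  summarise(ftrue = which(Behavior)[1])

print(newdf)
```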

Source: https://www.codenong.com/27533338/
