How to use sapply to divide different rows for each column
I'm a bit new at R, but I am trying to do something very simple using sapply because I will need to do it a lot. Say that you have many variables for 5 years, and want to divide the fifth row's values by the first row's for each of the columns at once.
a b c
184 20 55
100 32 563
18 12 88
5 99 52
32 36 22
So far I can either do it one by one:
df$a<-(df[5,]$a/df[1,]$a)
Or if I try to use sapply:
df2<-data.frame(sapply(names(df)[-1], function(x)
(df[x]/df[x])
))
The problem is that I don't know how to denote the rows with the sapply so above I'm just dividing vars by themselves. What's the quickest way of doing this? Thanks!
3 Answers
If you have mixed-type columns, here is a dplyr approach:
library(dplyr)
df %>% mutate_if(is.numeric, function(x) replace(x, length(x), x[length(x)] / x[1]))
# a b c d e
#1 184.000000 20.0 55.0 a A
#2 100.000000 32.0 563.0 b B
#3 18.000000 12.0 88.0 c C
#4 5.000000 99.0 52.0 d D
#5 0.173913 1.8 0.4 e E
# Sample data with mixed numeric and character columns
df <- read.table(text =
"a b c
184 20 55
100 32 563
18 12 88
5 99 52
32 36 22 ", header = T)
df <- cbind(df, d = letters[1:5], e = LETTERS[1:5])
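In more recent dplyr releases, `mutate_if()` is superseded by `across()`. As a rough sketch of the same idea in that style (assuming dplyr 1.0.0 or later; the data frame below is just the sample data from the question with one character column added for illustration):

```r
library(dplyr)

# Sample data from the question plus one character column (added for illustration)
df <- data.frame(
  a = c(184, 100, 18, 5, 32),
  b = c(20, 32, 12, 99, 36),
  c = c(55, 563, 88, 52, 22),
  d = letters[1:5]
)

# For every numeric column, overwrite the last value with last / first;
# n() is the number of rows, so n() indexes the last row
out <- df %>%
  mutate(across(where(is.numeric), ~ replace(.x, n(), .x[n()] / .x[1])))

out
```

Non-numeric columns such as `d` are passed through unchanged because `where(is.numeric)` never selects them.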
This is something that could be good as a function:
library(dplyr)
div_row <- function(data, numerator, denominator)
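# funs() is deprecated in dplyr; a formula lambda does the same job.
# as.numeric(.) keeps both if_else() branches the same type for integer columns.
data %>% mutate_if(is.numeric, ~ if_else(row_number() == numerator, .[numerator] / .[denominator], as.numeric(.)))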
df %>% div_row(5,1)
# a b c d
# 1 184 20 55 a
# 2 100 32 563 a
# 3 18 12 88 c
# 4 5 99 52 e
# 5 0.174 1.8 0.4 t
df %>% div_row(2,1)
# a b c d
# 1 184 20 55 a
# 2 0.543 1.6 10.2 a
# 3 18 12 88 c
# 4 5 99 52 e
# 5 32 36 22 t
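If you would rather not depend on dplyr, the same helper can be sketched in base R. The name `div_row_base` is made up for this example, and the data frame is the question's sample data with an assumed character column:

```r
# Hypothetical base R counterpart of div_row(): divide one row by another,
# touching numeric columns only
div_row_base <- function(data, numerator, denominator) {
  num_cols <- vapply(data, is.numeric, logical(1))
  data[numerator, num_cols] <- data[numerator, num_cols] / data[denominator, num_cols]
  data
}

# Sample data from the question plus one character column (added for illustration)
df <- data.frame(
  a = c(184, 100, 18, 5, 32),
  b = c(20, 32, 12, 99, 36),
  c = c(55, 563, 88, 52, 22),
  d = letters[1:5]
)

res <- div_row_base(df, 2, 1)
res
```

The function returns a modified copy, so `df` itself stays untouched unless you assign the result back to it.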
For this task you don't need sapply; instead, do:
df[5, ] <- df[5, ] / df[1, ]
df
# a b c
#1 184.000000 20.0 55.0
#2 100.000000 32.0 563.0
#3 18.000000 12.0 88.0
#4 5.000000 99.0 52.0
#5 0.173913 1.8 0.4
Referring to @Mako212's comment: if your data contains non-numeric columns, you could first create a logical vector that is TRUE at the positions of the numeric columns, use it for column subsetting, and then do the operation.
col_idx <- sapply(df, is.numeric)
df[5, col_idx] <- df[5, col_idx] / df[1, col_idx]
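To come back to the original question of how to denote rows inside sapply: the function you pass receives one whole column at a time, so rows are picked by indexing that vector. A minimal sketch, assuming you only want the ratios themselves rather than an updated data frame (the character column `d` is added here for illustration):

```r
# Sample data from the question plus one character column (added for illustration)
df <- data.frame(
  a = c(184, 100, 18, 5, 32),
  b = c(20, 32, 12, 99, 36),
  c = c(55, 563, 88, 52, 22),
  d = letters[1:5]
)

col_idx <- sapply(df, is.numeric)

# x is a single column (a plain vector), so x[5] and x[1] are its fifth and first rows
ratios <- sapply(df[col_idx], function(x) x[5] / x[1])
ratios
```

The result is a named numeric vector with one ratio per numeric column.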
This will throw an error if you have non-numeric columns; a simple way to avoid that would be to select your numeric columns: iris[1,1:4]/iris[5,1:4]
– Mako212
Aug 7 at 21:31
Yes, but OP does not mention non-numeric columns and writes "for each of the columns at once".
– markus
Aug 7 at 21:32
Thank you! This seemed easiest for me, and I ended up selecting which columns are numeric by hand because I was getting an error with this bit: col_idx <- sapply(iris, is.numeric). It would be nice to not have to specify them, though.
– Mia
Aug 20 at 16:20
@Mia Thanks for the reply. If the name of your data set is df, then do col_idx <- sapply(df, is.numeric). I used the iris data for illustration only, since you didn't mention non-numeric columns in your question.
– markus
Aug 20 at 21:05
Mia, please consider accepting one of the answers if any of them solved your issue.
– markus
Aug 10 at 7:21