R Programming week 3-Loop functions
Published: 2019-06-15


Looping on the Command Line

Writing for and while loops is useful when programming, but not particularly convenient when working interactively on the command line. R provides several functions that implement looping compactly to make life easier:

lapply: Loop over a list and evaluate a function on each element

sapply: Same as lapply but try to simplify the result

apply: Apply a function over the margins of an array

tapply: Apply a function over subsets of a vector

mapply: Multivariate version of lapply

An auxiliary function split is also useful, particularly in conjunction with lapply
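
To make the comparison with an explicit loop concrete, here is a minimal sketch that computes the mean of each list element first with a for loop and then with lapply and sapply. The list `scores` and the variable names are made up for illustration and do not come from the original notes.

```r
# Hypothetical data, used only for this illustration
scores <- list(a = c(1, 2, 3), b = c(10, 20), c = 5)

# Explicit loop: preallocate a named list, then fill it element by element
means_loop <- vector("list", length(scores))
names(means_loop) <- names(scores)
for (nm in names(scores)) {
  means_loop[[nm]] <- mean(scores[[nm]])
}

# The same computation in one line each
means_lapply <- lapply(scores, mean)  # always a list
means_sapply <- sapply(scores, mean)  # simplified to a named numeric vector
```

The loop and lapply produce the same list; sapply only changes the shape of the result.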

lapply

lapply takes three arguments: (1) a list X; (2) a function (or the name of a function) FUN; (3) other arguments via its ... argument. If X is not a list, it will be coerced to a list using as.list.

## function (X, FUN, ...)

## {

##     FUN <- match.fun(FUN)

##     if (!is.vector(X) || is.object(X))

##         X <- as.list(X)

##     .Internal(lapply(X, FUN))

## }

## <bytecode: 0x7ff7a1951c00>

## <environment: namespace:base>

The actual looping is done internally in C code.

lapply always returns a list, regardless of the class of the input.

x <- list(a = 1:5, b = rnorm(10))

lapply(x, mean)

 

x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))

lapply(x, mean)

> x <- 1:4

> lapply(x, runif)
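
In the runif call above, each element of x is passed as the first argument of runif, so the result is a list of vectors of lengths 1 to 4. The sketch below makes that visible and also shows extra arguments being forwarded through lapply's ... argument. The seed and the min/max values are arbitrary choices for illustration.

```r
set.seed(1)  # arbitrary seed, only so that the run is repeatable
x <- 1:4

# Each element of x becomes the first argument (n) of runif
draws <- lapply(x, runif)
lengths(draws)  # 1 2 3 4

# min and max are forwarded to runif through lapply's ... argument
scaled <- lapply(x, runif, min = 0, max = 10)
```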

lapply and friends make heavy use of anonymous functions.

> x <- list(a = matrix(1:4, 2, 2), b = matrix(1:6, 3, 2))

> x

$a

[,1] [,2]

[1,] 1 3

[2,] 2 4

$b

[,1] [,2]

[1,] 1 4

[2,] 2 5

[3,] 3 6

An anonymous function for extracting the first column of each matrix.

> lapply(x, function(elt) elt[,1])

$a

[1] 1 2

$b

[1] 1 2 3

sapply

> x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))

> lapply(x, mean)
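
sapply calls lapply and then tries to simplify the result: if every element has length 1 it returns a vector, if every element has the same length greater than 1 it returns a matrix, and otherwise it returns a list. A small sketch of the three cases, using a made-up list so the results are deterministic:

```r
# Hypothetical list, chosen so that no random numbers are involved
x <- list(a = 1:4, b = c(2, 4, 6), c = 10)

m <- sapply(x, mean)            # every result has length 1: named numeric vector
r <- sapply(x, range)           # every result has length 2: 2 x 3 matrix
l <- sapply(x, function(v) v)   # lengths differ: sapply falls back to a list
```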

apply

apply is used to evaluate a function (often an anonymous one) over the margins of an array.

It is most often used to apply a function to the rows or columns of a matrix.

It can be used with general arrays, e.g. taking the average of an array of matrices.

It is not really faster than writing a loop, but it works in one line!

> str(apply)

function (X, MARGIN, FUN, ...)

X is an array

MARGIN is an integer vector indicating which margins should be “retained”.

FUN is a function to be applied

... is for other arguments to be passed to FUN

> x <- matrix(rnorm(200), 20, 10)

> apply(x, 2, mean)

[1] 0.04868268 0.35743615 -0.09104379

[4] -0.05381370 -0.16552070 -0.18192493

[7] 0.10285727 0.36519270 0.14898850

[10] 0.26767260
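
The MARGIN argument names the dimensions that are kept; the function collapses everything else. The sketch below uses small integer inputs (chosen here for illustration, not taken from the notes) so the results can be checked by hand, and includes the "average of an array of matrices" case mentioned earlier.

```r
m <- matrix(1:6, nrow = 2, ncol = 3)

row_sums <- apply(m, 1, sum)   # keep margin 1 (rows): 9 12
col_sums <- apply(m, 2, sum)   # keep margin 2 (columns): 3 7 11

# A 2 x 2 x 10 array: keeping margins 1 and 2 averages over the third dimension
a <- array(1:40, dim = c(2, 2, 10))
avg <- apply(a, c(1, 2), mean)  # a 2 x 2 matrix
```

For this particular case, rowMeans(a, dims = 2) gives the same 2 x 2 result.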

col/row sums and means

For sums and means of matrix dimensions, we have some shortcuts.

rowSums = apply(x, 1, sum)

rowMeans = apply(x, 1, mean)

colSums = apply(x, 2, sum)

colMeans = apply(x, 2, mean)

The shortcut functions are much faster, but you won’t notice unless you’re using a large matrix.
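
As a quick check of these equivalences, the sketch below compares each shortcut with its apply counterpart. The seed is arbitrary, and all.equal is used instead of identical because the two routes may differ by floating-point rounding.

```r
set.seed(42)  # arbitrary seed for repeatability
x <- matrix(rnorm(200), 20, 10)

same_row_sums  <- all.equal(rowSums(x),  apply(x, 1, sum))
same_row_means <- all.equal(rowMeans(x), apply(x, 1, mean))
same_col_sums  <- all.equal(colSums(x),  apply(x, 2, sum))
same_col_means <- all.equal(colMeans(x), apply(x, 2, mean))
```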

Other Ways to Apply

Quantiles of the rows of a matrix.

> x <- matrix(rnorm(200), 20, 10)

> apply(x, 1, quantile, probs = c(0.25, 0.75))
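
When the applied function returns a vector rather than a single number, apply stacks the results as columns. A small sketch of the shape of the quantile result (the seed is an arbitrary choice):

```r
set.seed(10)  # arbitrary seed for repeatability
x <- matrix(rnorm(200), 20, 10)

# probs is passed on to quantile through apply's ... argument
q <- apply(x, 1, quantile, probs = c(0.25, 0.75))

dim(q)       # 2 20: one column per row of x
rownames(q)  # "25%" "75%"
```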

mapply

mapply is a multivariate apply of sorts which applies a function in parallel over a set of arguments.

> str(mapply)

function (FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE)

FUN is a function to apply

... contains arguments to apply over

MoreArgs is a list of other arguments to FUN

SIMPLIFY indicates whether the result should be simplified

The following is tedious to type

list(rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1))

Instead we can do

> mapply(rep, 1:4, 4:1)

Vectorizing a Function

> noise <- function(n, mean, sd) {

+ rnorm(n, mean, sd)

+ }

> noise(5, 1, 2)

[1] 2.4831198 2.4790100 0.4855190 -1.2117759

[5] -0.2743532

> noise(1:5, 1:5, 2)

[1] -4.2128648 -0.3989266 4.2507057 1.1572738

[5] 3.7413584

Instant Vectorization

> mapply(noise, 1:5, 1:5, 2)

Which is the same as

list(noise(1, 1, 2), noise(2, 2, 2), noise(3, 3, 2), noise(4, 4, 2), noise(5, 5, 2))
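
The difference between the two calls is easiest to see from the lengths of the results. When rnorm receives a vector for n, it uses the length of that vector as the number of draws, so noise(1:5, 1:5, 2) returns five numbers in total, whereas mapply calls noise once per (n, mean) pair. The sketch below checks this; the seed is arbitrary, and passing sd through MoreArgs is shown as an alternative way to supply a fixed argument.

```r
noise <- function(n, mean, sd) {
  rnorm(n, mean, sd)
}

set.seed(2)  # arbitrary seed for repeatability

# rnorm sees a length-5 n and returns 5 values, one per mean
direct <- noise(1:5, 1:5, 2)
length(direct)  # 5

# mapply calls noise(1, 1, 2), noise(2, 2, 2), ..., noise(5, 5, 2)
out <- mapply(noise, 1:5, 1:5, 2)
lengths(out)  # 1 2 3 4 5

# Arguments that stay fixed across calls can also go in MoreArgs
out2 <- mapply(noise, n = 1:5, mean = 1:5, MoreArgs = list(sd = 2))
```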

tapply

tapply is used to apply a function over subsets of a vector. I don’t know why it’s called tapply.

> str(tapply)

function (X, INDEX, FUN = NULL, ..., simplify = TRUE)

X is a vector

INDEX is a factor or a list of factors (or else they are coerced to factors)

FUN is a function to be applied

... contains other arguments to be passed to FUN

simplify indicates whether the result should be simplified

Take group means.

> x <- c(rnorm(10), runif(10), rnorm(10, 1))

> f <- gl(3, 10)

> f

[1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3

[24] 3 3 3 3 3 3 3

Levels: 1 2 3

> tapply(x, f, mean)

1 2 3

0.1144464 0.5163468 1.2463678

Take group means without simplification.

> tapply(x, f, mean, simplify = FALSE)

$`1`

[1] 0.1144464

$`2`

[1] 0.5163468

$`3`

[1] 1.246368

Find group ranges.

> tapply(x, f, range)

$`1`

[1] -1.097309 2.694970

$`2`

[1] 0.09479023 0.79107293

$`3`

[1] 0.4717443 2.5887025

split

split takes a vector or other object and splits it into groups determined by a factor or list of factors.

> str(split)

function (x, f, drop = FALSE, ...)

x is a vector (or list) or data frame

f is a factor (or coerced to one) or a list of factors

drop indicates whether empty factor levels should be dropped

A common idiom is split followed by an lapply.

> lapply(split(x, f), mean)
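
For a simple summary such as the mean, split followed by lapply (or sapply) gives the same numbers as tapply; the value of split is that the pieces can be arbitrary objects, such as data frames. A minimal sketch of the equivalence, with an arbitrary seed:

```r
set.seed(3)  # arbitrary seed for repeatability
x <- c(rnorm(10), runif(10), rnorm(10, 1))
f <- gl(3, 10)

groups <- split(x, f)                 # list with one numeric vector per level
via_split  <- sapply(groups, mean)    # named numeric vector
via_tapply <- tapply(x, f, mean)      # 1-d array with the same values

same <- all.equal(unname(via_split), as.vector(via_tapply))
```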

Splitting a Data Frame

> library(datasets)

> head(airquality)

> s <- split(airquality, airquality$Month)

> lapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")]))

> sapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")]))

> sapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")], na.rm = TRUE))
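
The na.rm = TRUE argument in the last call matters because airquality contains missing values: without it, any column with an NA in a given month has an NA mean. The sketch below compares the two results; the helper names `cols`, `with_na`, and `without_na` are introduced here for illustration.

```r
library(datasets)

s <- split(airquality, airquality$Month)
cols <- c("Ozone", "Solar.R", "Wind")

# Missing values propagate into the monthly means
with_na <- sapply(s, function(d) colMeans(d[, cols]))

# Dropping missing values within each month before averaging
without_na <- sapply(s, function(d) colMeans(d[, cols], na.rm = TRUE))

# Number of missing values per column in the full data set
na_counts <- colSums(is.na(airquality[, cols]))
```

Both results are 3 x 5 matrices with one column per month (5 through 9).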

Splitting on More than One Level

> x <- rnorm(10)

> f1 <- gl(2, 5)

> f2 <- gl(5, 2)

Interactions can create empty levels.

> str(split(x, list(f1, f2)))

Empty levels can be dropped.

> str(split(x, list(f1, f2), drop = TRUE))

List of 6

$ 1.1: num [1:2] -0.378 0.445

$ 1.2: num [1:2] 1.4066 0.0166

$ 1.3: num -0.355

$ 2.3: num 0.315

$ 2.4: num [1:2] -0.907 0.723

$ 2.5: num [1:2] 0.732 0.360
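
The six groups above are what remains of the ten possible combinations of f1 and f2. A short sketch, with an arbitrary seed, that counts the combinations and shows which ones survive drop = TRUE:

```r
set.seed(4)  # arbitrary seed for repeatability
x <- rnorm(10)
f1 <- gl(2, 5)
f2 <- gl(5, 2)

# Passing a list of factors splits on their interaction: 2 x 5 = 10 levels
all_groups <- split(x, list(f1, f2))
n_empty <- sum(lengths(all_groups) == 0)  # combinations that never occur

# drop = TRUE keeps only the combinations that are actually observed
kept <- split(x, list(f1, f2), drop = TRUE)
names(kept)
```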


Reposted from: https://www.cnblogs.com/jpld/p/4446804.html
