Wednesday, January 9, 2013

R COMMAND LINE LOOP FUNCTIONS

REVISED: Saturday, March 2, 2013




In this tutorial, you will receive an introduction to R command line loop functions.

I. R COMMAND LINE LOOP FUNCTIONS

The R command line can be used to do exploratory analysis using the following apply( ) functions:

apply( )     Apply Functions Over Array Margins.
by( )        Apply a Function to a Data Frame Split by Factors.
eapply( )    Apply a Function Over Values in an Environment.
lapply( )    Apply a Function over a List or Vector.
mapply( )    Apply a Function to Multiple List or Vector Arguments.
rapply( )    Recursively Apply a Function to a List.
sapply( )    Same as lapply( ) but simplifies the result to a vector or matrix where possible.
tapply( )    Apply a Function Over a Ragged Array.

A. lapply( )

> str(lapply)
function (X, FUN, ...)  
>

lapply( ) is for lists of data. It loops over a list of objects or a vector, evaluates a function on each element, and always returns a list. FUN can be an anonymous function, which you define inline and which exists only within the call to lapply( ).
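As a minimal sketch of the behavior described above, the following uses a small made-up list (the name scores and its values are assumptions for illustration only) to show a named function and an anonymous function passed to lapply( ):

```r
# Hypothetical example data: a named list of numeric vectors.
scores <- list(a = 1:5, b = c(2, 4, 6), c = 10)

# lapply( ) calls mean( ) on each element and always returns a list.
means <- lapply(scores, mean)
str(means)

# An anonymous function, defined inline, exists only for this call.
spreads <- lapply(scores, function(v) max(v) - min(v))
str(spreads)
```

The names of the input list (a, b, c) are carried over to the result.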

B. sapply( )

> str(sapply)
function (X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)
>


sapply( ) is the same as lapply( ) but tries to simplify the result into a vector or matrix. If the result cannot be simplified, sapply( ) returns a list, just as lapply( ) does.
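The three possible outcomes of that simplification can be sketched as follows. The list scores is made-up data, used here only as an assumption for illustration:

```r
# Hypothetical example data: a named list of numeric vectors.
scores <- list(a = 1:5, b = c(2, 4, 6), c = 10)

# Each result has length 1, so sapply( ) returns a named numeric vector.
avg <- sapply(scores, mean)

# Each result has length 2, so sapply( ) returns a 2 x 3 matrix.
rng <- sapply(scores, range)

# The results have different lengths, so no simplification is possible
# and sapply( ) falls back to returning a list.
big <- sapply(scores, function(v) v[v > 2])

str(avg); str(rng); str(big)
```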

C. apply( )

> str(apply)
function (X, MARGIN, FUN, ...)
>

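apply( ) applies a function over the margins of an array: MARGIN = 1 for the rows of a matrix, MARGIN = 2 for the columns. The built-in shortcut functions are equivalent to the following calls to apply( ), but are faster:

rowSums(x)   =  apply(x, 1, sum)
rowMeans(x)  =  apply(x, 1, mean)
colSums(x)   =  apply(x, 2, sum)
colMeans(x)  =  apply(x, 2, mean)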

For quantiles of the rows of a matrix you could use the following. Each column of the result y corresponds to one row of x:

> x <- matrix(rnorm(50), nrow=10, ncol=5)
> y <- apply(x, 1, quantile, probs=c(0.40,0.60))
> y
          [,1]        [,2]        [,3]
40% -0.1962752 -0.67482345 -0.04080054
60%  0.4259380 -0.06988917  0.05005659
          [,4]      [,5]       [,6]
40% -0.1304303 -1.479721 -0.3722651
60%  0.7390716 -1.399231 -0.1131933
          [,7]      [,8]       [,9]
40% -0.3533020 0.3526703 -0.5351401
60%  0.1227106 0.6044279 -0.1831993
         [,10]
40% -0.2753252
60%  0.1582800
>
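Because rnorm( ) produces different numbers on every run, the output above cannot be reproduced exactly. Here is a minimal sketch with a small fixed matrix (the name m and its values are assumptions for illustration) that shows the same calls with predictable results:

```r
# A small fixed matrix (hypothetical data) so the results are reproducible.
m <- matrix(as.numeric(1:12), nrow = 3, ncol = 4)

row_totals <- apply(m, 1, sum)    # MARGIN = 1: one result per row
col_means  <- apply(m, 2, mean)   # MARGIN = 2: one result per column

# The dedicated functions give the same values.
all.equal(row_totals, rowSums(m))
all.equal(col_means, colMeans(m))

# Extra arguments after FUN are passed on to FUN. Each row yields two
# quantiles, so the result has 2 rows and one column per row of m.
q <- apply(m, 1, quantile, probs = c(0.40, 0.60))
dim(q)
```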

D. tapply( )

tapply( ) is basically split( ) + lapply( ), with the result simplified where possible. You use tapply( ) when you want a function to act on subsets of the input vector that are defined by a factor.

> str(tapply)
function (X, INDEX, FUN = NULL, ..., simplify = TRUE)


> str(gl)
function (n, k, length = n * k, labels = 1:n, ordered = FALSE)
>   

tapply( ) applies a function over subsets of a vector. In the example, gl( ) generates the factor that defines the subsets.

> x <- c(rnorm(5), runif(5), rnorm(5, 1))
> f <- gl(3,5)  #3 levels, each level repeated 5 times.
> tapply( x, f, mean)
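The call above returns one mean per level of f, but the values change on every run because the data are random. A minimal sketch with fixed, made-up values (the labels "low" and "high" are assumptions for illustration) shows the result and the equivalent split( ) computation:

```r
# Fixed values (hypothetical data) so the group means are reproducible.
x <- c(1, 2, 3, 10, 20, 30)
f <- gl(2, 3, labels = c("low", "high"))  # 2 levels, each repeated 3 times

# One mean per level of f: low = 2, high = 20.
group_means <- tapply(x, f, mean)
group_means

# The same computation done by hand with split( ) and sapply( ).
by_hand <- sapply(split(x, f), mean)
by_hand
```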

E. mapply( )

> str(mapply)
function (FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE)


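mapply( ) is a multivariate version of sapply( ). It applies FUN to the first elements of each ... argument, then to the second elements, and so on.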
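As a minimal sketch of how the arguments are paired up, the following calls use made-up inputs (all variable names are assumptions for illustration):

```r
# rep( ) is called with paired arguments: rep(1, 3), rep(2, 2), rep(3, 1).
# The results have different lengths, so a list is returned.
reps <- mapply(rep, 1:3, 3:1)

# Results of length 1 are simplified to a vector: 11, 22, 33.
sums <- mapply(function(a, b) a + b, c(1, 2, 3), c(10, 20, 30))

# MoreArgs holds arguments that stay the same for every call.
tags <- mapply(function(name, n, sep) paste(name, n, sep = sep),
               c("x", "y"), 1:2, MoreArgs = list(sep = "-"))

str(reps); sums; tags
```

Because the first vector passed to the last call is a character vector, its values are used as names for the result (USE.NAMES = TRUE).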

F. split( )

> str(split)
function (x, f, drop = FALSE, ...)  
>

split( ) divides a vector or data frame x into the groups defined by the factor f, and is commonly used in conjunction with lapply( ) and sapply( ). split( ) always returns a list.
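The split-then-apply pattern can be sketched as follows, first on a vector and then on a data frame. All data and names here are made up for illustration:

```r
# Splitting a vector by a factor gives a list with one element per level.
x <- c(1, 2, 3, 10, 20, 30)
f <- gl(2, 3, labels = c("low", "high"))
pieces <- split(x, f)
str(pieces)

# sapply( ) then summarizes each piece: low = 6, high = 60.
totals <- sapply(pieces, sum)

# Splitting a data frame gives a list of smaller data frames.
df <- data.frame(g = c("a", "b", "a", "b"), v = c(1, 2, 3, 4))
by_group <- split(df, df$g)

# Mean of v within each group: a = 2, b = 3.
group_avg <- sapply(by_group, function(d) mean(d$v))
```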

In this tutorial, you have received an introduction to R command line loop functions.


Elcric Otto Circle






