By-Processing - Part 2 (The summaryBy() Function)


R offers several functions and methods for computing statistics or, more generally, applying functions to groups of observations within a data set. The tapply() function is one of them, but others are more flexible and powerful. These approaches produce results similar to those obtained in SAS when a BY statement is used to generate results in each of several "by-groups". This series of three videos covers three approaches to such "by-processing" in R.
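As a point of reference for the by-group idea, here is a minimal tapply() sketch. It uses R's built-in mtcars data set, which is an illustrative choice and not necessarily the data used in the videos.

```r
# Mean miles-per-gallon within each cylinder group (the "by-groups")
mpg_by_cyl <- tapply(mtcars$mpg, mtcars$cyl, mean)

# The result is a named array with one element per group
print(mpg_by_cyl)
```

Note that tapply() handles one variable and one function per call, which is part of the motivation for the more flexible tools covered in this series.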

In this "Part 2" video, we introduce the summaryBy() function from the doBy package. This function is similar to aggregate(), but it allows more than one function to be applied to each group of a data frame in a single call. It also has an optional id= argument (similar to the ID statement in SAS's PROC MEANS) that copies variables to the resulting data frame without summarizing them.
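The two features described above can be sketched as follows. This is a minimal illustration, assuming the doBy package is installed; the trials data frame and its column names are hypothetical and do not come from the video.

```r
# Assumes doBy is installed: install.packages("doBy")
library(doBy)

# Hypothetical data: two plots per site, each site lying in one region
trials <- data.frame(
  site   = c("A", "A", "B", "B", "C", "C"),
  region = c("North", "North", "North", "North", "South", "South"),
  yield  = c(4.1, 3.9, 5.2, 5.6, 2.8, 3.0),
  height = c(61, 59, 70, 74, 50, 52)
)

# Left of ~ : variables to summarize; right of ~ : grouping variable.
# FUN takes several functions at once; id = ~ region carries the
# region column into the result without summarizing it.
site_stats <- summaryBy(yield + height ~ site, data = trials,
                        FUN = c(mean, sd), id = ~ region)

print(site_stats)
```

The result has one row per site, with a separate column for each combination of summarized variable and function, plus the region column carried along by id=.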

The summaryBy() function is more flexible than aggregate(), but perhaps not as elegant or easy to use as the method covered in Part 3 of this series. I recommend watching this video to learn about it. However, you do not need to know all three methods of by-processing covered in the series, so if you are short on time you could skip this video and concentrate on "By-Processing - Part 3".
