Data manipulation is the process of preparing data sets, often large ones, in the form required for statistical analysis. In a large data set, much of the content may be unrelated to what you are trying to accomplish, so the relevant parts must be selected and restructured. Data manipulation can be quite complex, but it is very important for achieving the goals of the analysis.
Data manipulation covers a wide variety of tasks, such as:
- getting data from text files, spreadsheets, databases and other sources and importing it into an appropriate statistical package
- manipulating date/time data and character (string) data
- aggregating data and reshaping data
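The tasks listed above can be illustrated with a minimal sketch in Python, using only the standard library. It is not tied to any particular statistical package, and the data, column names (`site`, `visit_date`, `weight_kg`) and function names are all hypothetical, chosen only for illustration. The sketch reads delimited text, cleans the character and date fields, aggregates by group, and reshapes the result from long to wide form.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical raw data, standing in for a text file or spreadsheet export.
RAW_CSV = """site,visit_date,weight_kg
 Alpha ,2014-03-01,10.5
alpha,2014-04-15,11.5
BETA,2014-04-02,20.0
"""


def load_rows(text):
    """Read CSV text and convert each field to a suitable type."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append({
            # Character manipulation: trim whitespace and normalise case.
            "site": record["site"].strip().lower(),
            # Date/time manipulation: parse the date string into a date object.
            "visit_date": datetime.strptime(record["visit_date"], "%Y-%m-%d").date(),
            "weight_kg": float(record["weight_kg"]),
        })
    return rows


def mean_weight_by_site(rows):
    """Aggregate: compute the mean weight for each site."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["site"]].append(row["weight_kg"])
    return {site: sum(values) / len(values) for site, values in groups.items()}


def reshape_wide(rows):
    """Reshape long to wide: one entry per site, with one value per month."""
    wide = defaultdict(dict)
    for row in rows:
        month = row["visit_date"].strftime("%Y-%m")
        wide[row["site"]][month] = row["weight_kg"]
    return dict(wide)


rows = load_rows(RAW_CSV)
means = mean_weight_by_site(rows)
wide = reshape_wide(rows)
```

Note that the inconsistent spellings " Alpha " and "alpha" only fall into the same group because the site names are cleaned before aggregating, which is one reason the cleaning step usually comes first.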
SOME RECOMMENDED RESOURCES
- Cody, R. 2008. Cody’s Data Cleaning Techniques Using SAS, 2nd Edition. SAS Institute.
- Spector, P. 2008. Data Manipulation with R. Springer, New York.
- Nolan, D. and Temple Lang, D. 2014. XML and Web Technologies for Data Sciences with R. Springer, New York.
