4 - Advanced R
Objectives
- An Introduction to data.table
- An Introduction to dplyr
- A Brief Overview of Parallel Computing with R, and some Big Data considerations
- Getting a basic understanding of single- vs multi-threaded computing, parallelism, and benchmarking
Topics
data.table
- The [i, j, by] idiom
- fread and fwrite
- dcast and melt
- examples / case study in appendix
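
The data.table topics above can be sketched in a few lines. This is a minimal illustration rather than the course's own case study: the toy table and its column names (id, grp, x, y) are made up for the example, and it assumes the data.table package is installed.

```r
library(data.table)

# Hypothetical toy data: three measurements for each of two groups
dt <- data.table(id  = 1:6,
                 grp = rep(c("a", "b"), each = 3),
                 x   = c(1, 2, 3, 10, 20, 30))

# The [i, j, by] idiom: filter rows (i), compute (j), per group (by)
res <- dt[x > 1, .(total = sum(x), n = .N), by = grp]

# fwrite / fread: round trip through a temporary CSV file
tmp <- tempfile(fileext = ".csv")
fwrite(dt, tmp)
dt2 <- fread(tmp)

# melt to long format, then dcast back to wide format
dt[, y := x * 2]   # add a second measurement column by reference
long <- melt(dt, id.vars = c("id", "grp"), measure.vars = c("x", "y"))
wide <- dcast(long, id + grp ~ variable, value.var = "value")
```

Here `res` holds one row per group, `long` has one row per (id, variable) pair, and `wide` recovers the original one-row-per-id layout.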
dplyr
- filter, select, mutate
- pipe (now native since R 4.1.0)
- examples / case study
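
As a small sketch of the dplyr verbs chained with the native pipe: the example below uses the built-in mtcars data, and the derived column name kpl (kilometres per litre) is an assumption chosen for illustration. It assumes dplyr is installed and R is at least version 4.1.0.

```r
library(dplyr)

# Keep four-cylinder cars, keep two columns, then derive a new one
cars_kpl <- mtcars |>
  filter(cyl == 4) |>              # rows: only four-cylinder cars
  select(mpg, wt) |>               # columns: fuel economy and weight
  mutate(kpl = mpg * 0.425144)     # miles per gallon to km per litre
```

Each verb takes a data frame as its first argument and returns a data frame, which is what makes the steps composable with `|>`.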
- Parallel Computing with R, which is single-threaded
- Its math libraries may not be single-threaded
- The parallel package as a good starting point: mclapply, parLapply
- simple benchmarking
- big data / external memory / bigmemory
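
A rough sketch of how mclapply, parLapply, and simple benchmarking fit together, using only the parallel package that ships with R. The workload (slow_square), the input size, and the worker count of 2 are hypothetical choices for illustration; actual timings depend on the machine, so none are shown.

```r
library(parallel)

# Hypothetical workload: a deliberately slow function
slow_square <- function(i) {
  Sys.sleep(0.05)
  i^2
}
xs <- 1:8
n_workers <- 2L   # assumption: at least two cores are available

# Baseline: ordinary single-threaded lapply, timed with system.time
t_serial <- system.time(r_serial <- lapply(xs, slow_square))

# parLapply: socket cluster of worker processes, works on all platforms
cl <- makeCluster(n_workers)
t_par <- system.time(r_par <- parLapply(cl, xs, slow_square))
stopCluster(cl)

# mclapply: forks the current process; on Windows only mc.cores = 1 works
cores <- if (.Platform$OS.type == "windows") 1L else n_workers
r_fork <- mclapply(xs, slow_square, mc.cores = cores)

# Simple benchmark: compare elapsed wall-clock seconds
print(c(serial = t_serial[["elapsed"]], parLapply = t_par[["elapsed"]]))
```

All three calls return the same list of results; only the way the work is distributed differs. For very short tasks, the cost of starting workers and moving data can outweigh any gain.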
- Efficient R Programming (Gillespie/Lovelace book)
- Chapter 3: Efficient Programming
- Chapter 5: Efficient I/O
- Chapter 6: Efficient Data Carpentry
- Chapter 7: Efficient Optimization
- Efficient data wrangling
Core Material
Lecture Slides
Lecture Videos
Additional Resources