GrAPE

Graphical Assistant for Prerequisite Enrollment

DSC department

DSC 232R. Big Data Analytics Using Spark (4 units)

Link to catalog page: https://catalog.ucsd.edu/courses/DSC.html#dsc232r

Description

This course covers techniques for achieving scalability in data analysis, using tools such as MapReduce, Hadoop, and Spark. Topics include programming Spark using PySpark; identifying the computational tradeoffs in a Spark application; performing data loading and cleaning using Spark and Parquet; modeling data through statistical and machine learning methods; and mitigating bottlenecks that arise in massively parallel computations by using the Spark framework. This is a distance education course. Prerequisites: DSC 255R. Restricted to major code DS77. All other students with graduate standing may be considered as space permits.

Prerequisite courses

DSC 255R

Successor courses

No courses have DSC 232R as a prerequisite.