GrAPE

Graphical Assistant for Prerequisite Enrollment

DSC department

DSC 232R. Big Data Analytics Using Spark (4 units)

Link to catalog page: https://catalog.ucsd.edu/courses/DSC.html#dsc232r

Description

This course covers techniques for achieving scalability in data analysis, using tools such as MapReduce, Hadoop, and Spark. Topics include programming Spark using PySpark; identifying the computational tradeoffs in a Spark application; performing data loading and cleaning using Spark and Parquet; modeling data through statistical and machine learning methods; and mitigating bottlenecks that arise in massively parallel computations by using the Spark framework. This is a distance education course. Prerequisites: DSC 255R. Restricted to major codes DS78 and DS79. All other students with graduate standing may be considered as space permits.

Prerequisite courses

DSC 255R

Successor courses

No courses have DSC 232R as a prerequisite.