Spark Optimization with Scala
Go fast or go home. Learn the ins and outs of Spark and get the best out of your code.
Why the $&*(# is my job running so slow?
I've had my fair share of pain with Spark, and if you're reading this, you've probably seen this too: you run a 4-line job on a gig of data, with two innocent joins, and it takes a bloody hour to run. Or another one: you have an hour-long job that was progressing smoothly until task 1149/1150, where it hangs, and after two more hours you decide to kill it because you don't know if it's you or a bug in Spark. Usually it's PIBKAC - problem is between keyboard and chair - but in desperation, the only idea you have is to turn it off and on again.
Then you go like, "hm, maybe my Spark cluster is too small, let me bump some CPU and mem". Then... same thing. Amazon's probably laughing now and you're paying for it. So this has to be the million-dollar question.
This is the only course on the web on Spark optimization. With the techniques you learn here you will save yourself time, headaches and money.