Read and process CSV Files
Objective

Read and analyse a CSV file containing sales data for a grocery store.
Reading the CSV file

Copy the data below into a file named orders.csv and read it using Spark.

sales.csv
Now, open a terminal and run spark-shell to launch the interactive REPL for running Spark programs.
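Here is a minimal sketch of that first read, assuming orders.csv sits in the directory you launched spark-shell from; the DataFrame name ordersDf is just an illustrative choice.

```scala
// From the terminal, launch the Spark REPL:
//   $ spark-shell

// Inside spark-shell, a SparkSession named `spark` is already available.
// Read the CSV into a DataFrame (this path assumes orders.csv is in the
// directory spark-shell was launched from).
val ordersDf = spark.read.csv("orders.csv")
```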
Let's take a sneak peek at the dataset we just read using the show method, which takes a parameter specifying the number of rows to fetch.
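For example, continuing with the ordersDf DataFrame from the sketch above:

```scala
// Preview the first 5 rows of the DataFrame (5 is just an example count).
ordersDf.show(5)
```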
Notice that the columns get generic names such as _c0 and _c1, and the header row of the CSV is treated as a data row. To fix this, let's change the read options for the CSV.
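One way to do this is sketched below: the header option tells Spark to use the first line as column names, and inferSchema (an extra, optional setting not mentioned above) asks Spark to guess column types instead of reading everything as strings.

```scala
// Re-read the file, treating the first line as the header row.
// inferSchema is an optional extra that infers column types
// instead of leaving every column as a string.
val ordersDf = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv("orders.csv")

// The columns now carry the names from the CSV header.
ordersDf.show(5)
```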
Get total orders on each day

Based on the given dataset, we will calculate the total number of orders the store received on each day.
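A sketch of one way to do this with a groupBy aggregation; the column name order_date is a placeholder, since the actual column names depend on the contents of sales.csv.

```scala
// Group the orders by their date column and count rows per group.
// "order_date" is an assumed name; substitute the actual date column
// from your sales data.
val ordersPerDay = ordersDf
  .groupBy("order_date")
  .count()                  // adds a "count" column with orders per day
  .orderBy("order_date")    // optional: sort the result by date

ordersPerDay.show()
```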