Python 3 PySpark Library Example to Read CSV Files Using Spark Library Full Tutorial For Beginners

 

Welcome folks! In this post we will read CSV files using the PySpark library in Python. The full source code of the application is shown below.

 

 

Get Started

 

 

To get started, install the library with the pip command shown below:

 

pip install pyspark

 

After installing the library, create an app.py file and paste in the following code:

 

app.py

 

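from pyspark.sql import SparkSession

# Create a Spark session, or reuse one that already exists
spark = SparkSession \
    .builder \
    .appName("how to read csv file") \
    .getOrCreate()

# Read data.csv, using the first row as the column names
df = spark.read.csv('data.csv', header=True)

# Print the rows as a table
df.show()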

 

 

The script above imports SparkSession from the pyspark library, creates a session, and reads the data.csv file from the directory you run the script in. The header=True option tells Spark to treat the first row as column names. Create a data.csv file in that directory and paste in the following content:


 

data.csv

 

name,class
gautam,11
sumit,12
john,23
sameer,29
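If you would rather generate the sample file than paste it by hand, here is a minimal sketch using Python's built-in csv module. The helper name write_sample_csv and the ROWS constant are made up for this example and are not part of the original script; the rows themselves are copied from the listing above.

```python
import csv

# Rows copied from the data.csv listing; the first row is the header.
ROWS = [
    ["name", "class"],
    ["gautam", "11"],
    ["sumit", "12"],
    ["john", "23"],
    ["sameer", "29"],
]


def write_sample_csv(path="data.csv"):
    """Write the sample rows to `path` (hypothetical helper for this sketch)."""
    # newline="" lets the csv module control line endings itself;
    # lineterminator="\n" keeps the file identical to the listing.
    with open(path, "w", newline="") as f:
        csv.writer(f, lineterminator="\n").writerows(ROWS)
    return path
```

Calling write_sample_csv() with no arguments creates data.csv in the current directory, which is where app.py looks for it.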

 

 

 

The CSV file above has two columns, name and class. To print its contents as a table, run the script with the following command:

 

python app.py

 

 

 
