package com.sample.wc

import java.io.File

import org.apache.commons.io.FileUtils
import org.apache.spark.sql.SparkSession

object WordCount {
  def main(args: Array[String]): Unit = {
    // Create the SparkSession
    val spark = SparkSession.builder().master("local").appName("Word Count").getOrCreate()

    // Read the text file and convert it to an RDD of lines
    val data = spark.read.textFile(args(0)).rdd

    // Split each line of the text file on spaces
    val wordsSplits = data.flatMap(line => line.split(" "))

    // Map each word to (word, 1) to ease the counting
    val wordMaptoOne = wordsSplits.map(value => (value, 1))

    // Sum the counts for each word
    val count = wordMaptoOne.reduceByKey(_ + _)

    // Delete the output directory if it already exists
    FileUtils.deleteDirectory(new File(args(1)))

    // Save the output as text
    count.saveAsTextFile(args(1))

    // Stop the SparkSession
    spark.stop()
  }
}

Command to execute the Jar file:

bin/spark-submit --class com.sample.wc.WordCount WordCounts.jar text.txt output
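
The counting steps in the program (split lines into words, pair each word with 1, then sum per word) can be tried without a Spark installation. The sketch below is only an illustration using plain Scala collections, not part of the original program: the object name WordCountSketch and the helper countWords are hypothetical, groupBy plus sum stands in for Spark's reduceByKey, and it splits on any run of whitespace and drops empty tokens, which differs slightly from the split(" ") used in the program.

```scala
object WordCountSketch {
  // Hypothetical helper: in-memory word count that mirrors the
  // flatMap -> map -> reduceByKey pipeline of the Spark program.
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+"))        // split each line on runs of whitespace
      .filter(_.nonEmpty)              // drop empty tokens (blank lines, leading spaces)
      .map(word => (word, 1))          // pair each word with a count of 1
      .groupBy(_._1)                   // group the pairs by word
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // sum the 1s per word

  def main(args: Array[String]): Unit = {
    val counts = countWords(Seq("to be or not to be", "to do"))
    // Print the pairs sorted by word so the output order is stable
    counts.toSeq.sortBy(_._1).foreach(println)
  }
}
```

On a real cluster the grouping step would be replaced by reduceByKey, which combines counts within each partition before shuffling; the in-memory version is only meant to make the logic easy to check on a few lines of text.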
Saturday, April 21, 2018
My first Spark Program for Word count using Scala