In order to explain, Lets create a
dataframe with 3 columns
spark-shell --queue= *;
To adjust logging level use sc.setLogLevel(newLevel).
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 1.6.0
Spark context available as sc
SQL context available as sqlContext.
scala> val sqlcontext = new org.apache.spark.sql.SQLContext(sc)
sqlcontext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@4f9a8d71
scala> import org.apache.spark.sql.Column
scala> val BazarDF = Seq(
("Veg", "tomato", 1.99),
("Veg", "potato", 0.45),
("Fruit", "apple", 0.99),
("Fruit", "pineapple", 2.59)
).toDF("Type", "Item", "Price")
BazarDF: org.apache.spark.sql.DataFrame = [Type: string, Item: string, Price: double]
Lets see how to add 3 new columns into
this dataframe with dummy values.
One way of doing this is
scala> var BazarWithColumnDF = BazarDF.withColumn("Retailer",lit("null").as("StringType"))
.withColumn("Quantity",lit(0.0).as("DoubleType"))
BazarWithColumnDF: org.apache.spark.sql.DataFrame =
[Type: string, Item: string, Price: double, Retailer: string, Quantity: double]
We can use the same method using
foldLeft aswell
Now we need to define a list with new columnName to do this.
scala> var ColNameWithDatatype: List[(String, Column)] = List() ColNameWithDatatype: List[(String, org.apache.spark.sql.Column)] = List() scala> ColNameWithDatatype = List(("Retailer", lit("null").as("StringType")), ("Quantity", lit(0.0).as("DoubleType"))) ColNameWithDatatype: List[(String, org.apache.spark.sql.Column)] = List((Retailer,null AS StringType#91), (Quantity,0.0 AS DoubleType#92)) scala> var BazarWithColumnDF1 = ColNameWithDatatype.foldLeft(BazarDF) { (tempDF, colName) => | tempDF.withColumn(colName._1, colName._2) | } BazarWithColumnDF1: org.apache.spark.sql.DataFrame = [Type: string, Item: string, Price: double, Retailer: string, Quantity: double]scala> BazarWithColumnDF1.show()
+-----+---------+-----+--------+--------+
| Type| Item|Price|Retailer|Quantity|
+-----+---------+-----+--------+--------+
| Veg| tomato| 1.99| null| 0.0|
| Veg| potato| 0.45| null| 0.0|
|Fruit| apple| 0.99| null| 0.0|
|Fruit|pineapple| 2.59| null| 0.0|
+-----+---------+-----+--------+--------+
Nice blog
ReplyDeleteiot courses in Bangalore
internet of things training course in Bangalore
internet of things course in Bangalore
Super helpful :) Thanks.
ReplyDeleteThanks for the information!
ReplyDeleteawesome blog you shared with us and its very helpful for my training institute candidates. keep update more techniques on spark online training Hyderabad
ReplyDeleteYou are doing a great job by sharing useful information about apache spark scala course. It is one of the post to read and imporove my knowledge in apache spark scala.apache spark scala training institute in bangalore is also one of the training institute which provides best training in apache spark scala, big data, data science and many other courses.
ReplyDeleteThanks for sharing information
ReplyDeletePiping Design Course
Piping Design Course in India
Piping Design Course in Hyderabad
Welding Inspector Course
Quality Management Course
Quality Management Course in india
Safety officer course