In order to explain, Lets create a
dataframe with 3 columns
spark-shell --queue= *;To adjust logging level use sc.setLogLevel(newLevel).Welcome to____ __/ __/__ ___ _____/ /___\ \/ _ \/ _ `/ __/ '_//___/ .__/\_,_/_/ /_/\_\ version 1.6.0Spark context available as scSQL context available as sqlContext.scala> val sqlcontext = new org.apache.spark.sql.SQLContext(sc)sqlcontext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@4f9a8d71scala> import org.apache.spark.sql.Columnscala> val BazarDF = Seq(("Veg", "tomato", 1.99),("Veg", "potato", 0.45),("Fruit", "apple", 0.99),("Fruit", "pineapple", 2.59)).toDF("Type", "Item", "Price")BazarDF: org.apache.spark.sql.DataFrame = [Type: string, Item: string, Price: double]
Lets see how to add 3 new columns into
this dataframe with dummy values.
One way of doing this is
scala> var BazarWithColumnDF = BazarDF.withColumn("Retailer",lit("null").as("StringType")).withColumn("Quantity",lit(0.0).as("DoubleType"))BazarWithColumnDF: org.apache.spark.sql.DataFrame =[Type: string, Item: string, Price: double, Retailer: string, Quantity: double]
We can use the same method using
foldLeft aswell
Now we need to define a list with new columnName to do this.
scala> var ColNameWithDatatype: List[(String, Column)] = List() ColNameWithDatatype: List[(String, org.apache.spark.sql.Column)] = List() scala> ColNameWithDatatype = List(("Retailer", lit("null").as("StringType")), ("Quantity", lit(0.0).as("DoubleType"))) ColNameWithDatatype: List[(String, org.apache.spark.sql.Column)] = List((Retailer,null AS StringType#91), (Quantity,0.0 AS DoubleType#92)) scala> var BazarWithColumnDF1 = ColNameWithDatatype.foldLeft(BazarDF) { (tempDF, colName) => | tempDF.withColumn(colName._1, colName._2) | } BazarWithColumnDF1: org.apache.spark.sql.DataFrame = [Type: string, Item: string, Price: double, Retailer: string, Quantity: double]scala> BazarWithColumnDF1.show()+-----+---------+-----+--------+--------+| Type| Item|Price|Retailer|Quantity|+-----+---------+-----+--------+--------+| Veg| tomato| 1.99| null| 0.0|| Veg| potato| 0.45| null| 0.0||Fruit| apple| 0.99| null| 0.0||Fruit|pineapple| 2.59| null| 0.0|+-----+---------+-----+--------+--------+