Let us create an example DataFrame to show how to select a list of columns of type Column from a DataFrame.
spark-shell --queue= *;
To adjust logging level use sc.setLogLevel(newLevel).
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 1.6.0
Spark context available as sc
SQL context available as sqlContext.
scala> val sqlcontext = new org.apache.spark.sql.SQLContext(sc)
sqlcontext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@4f9a8d71
scala> val BazarDF = Seq(
| ("Veg", "tomato", 1.99),
| ("Veg", "potato", 0.45),
| ("Fruit", "apple", 0.99),
| ("Fruit", "pineapple", 2.59),
| ("Fruit", "apple", 1.99)
| ).toDF("Type", "Item", "Price")
BazarDF: org.apache.spark.sql.DataFrame = [Type: string, Item: string, Price: double]
scala> BazarDF.show()
+-----+---------+-----+
| Type|     Item|Price|
+-----+---------+-----+
|  Veg|   tomato| 1.99|
|  Veg|   potato| 0.45|
|Fruit|    apple| 0.99|
|Fruit|pineapple| 2.59|
|Fruit|    apple| 1.99|
+-----+---------+-----+
Create a List[Column] with column names.
scala> var selectExpr : List[Column] = List("Type","Item","Price")
<console>:25: error: not found: type Column
var selectExpr : List[Column] = List("Type","Item","Price")
^
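If you get the same error, the Column type is not in scope: import org.apache.spark.sql.Column before declaring the list. The elements must also be Column values rather than plain strings, for example by wrapping each name with org.apache.spark.sql.functions.col.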
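Here is a minimal sketch of a declaration that should compile. It assumes you paste it into the same spark-shell session, where BazarDF is already defined. The import and the use of col are my suggested fix, not something shown in the session above.

```scala
// Assumes the spark-shell session above, where BazarDF already exists.
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.col

// Turn each column name into a Column so the list really is a List[Column].
val selectExpr: List[Column] =
  List("Type", "Item", "Price").map(name => col(name))
```

I used val here instead of var because the list is never reassigned; var works as well.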
Use the : _* annotation to pass the list to select as a variable-length argument list.
scala> var dfNew = BazarDF.select(selectExpr: _*)
dfNew: org.apache.spark.sql.DataFrame = [Type: string, Item: string, Price: double]
scala> dfNew.show()
+-----+---------+-----+
| Type|     Item|Price|
+-----+---------+-----+
|  Veg|   tomato| 1.99|
|  Veg|   potato| 0.45|
|Fruit|    apple| 0.99|
|Fruit|pineapple| 2.59|
|Fruit|    apple| 1.99|
+-----+---------+-----+
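If the : _* annotation is new to you, it is plain Scala and not specific to Spark. The small sketch below uses a hypothetical helper, joinNames, that I made up for illustration. It takes a variable number of strings, just as select takes a variable number of columns.

```scala
// Hypothetical helper (not part of Spark): accepts any number of names.
def joinNames(names: String*): String = names.mkString(", ")

val cols = List("Type", "Item", "Price")

// cols: _* passes each element of the list as a separate argument,
// so this call is equivalent to joinNames("Type", "Item", "Price").
val joined = joinNames(cols: _*)
```

Without : _* the call would not compile, because the method expects individual String arguments and not a single List[String].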
You are done!
I have also explained how to select multiple columns from a Spark DataFrame using List[String].