Monday 5 October 2020

How to write dataframe output to a single file with a specific name using Spark


Spark is designed to write its output as multiple part files in parallel. Sometimes, though, we need to merge all the part files, drop the success/commit marker files, and end up with the content in a single file.

This post shows how to write Spark output to a single file.

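Using df.coalesce(1), we can write the data as a single part file:

  result_location = "dbfs:///mnt/datalake/unmesha/output/"
  df.coalesce(1).write.format("csv").options(header='true').mode("overwrite").save(result_location)

However, result_location is still a folder: alongside the single part file (which gets a generated name), you will still see marker files such as _SUCCESS.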
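
To illustrate what a coalesce(1) output folder may contain, here is a small sketch in plain Python (no Spark needed). The file names in example_listing are illustrative assumptions, since the exact part-file and marker names depend on the Spark version and platform, and split_output_listing is a hypothetical helper, not a Spark or Databricks API.

```python
# Illustrative names only: real part-file and marker names vary by
# Spark version and platform.
example_listing = [
    "_SUCCESS",
    "_committed_1234567890",
    "_started_1234567890",
    "part-00000-tid-1234567890-c000.csv",
]


def split_output_listing(names):
    """Separate data (part) files from marker files in an output folder listing."""
    data_files = [n for n in names if n.startswith("part-")]
    marker_files = [n for n in names if n.startswith("_")]
    return data_files, marker_files


data_files, marker_files = split_output_listing(example_listing)
print("data files:  ", data_files)
print("marker files:", marker_files)
```

Only the part file holds the data; the names starting with an underscore are bookkeeping files written by the job.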

Adding coalesce alone isn't sufficient when you want the data written to a file with a specific name.

We can achieve this using dbutils:

  result_location = "dbfs:///mnt/datalake/unmesha/output/"
  # Write a single part file into the output folder
  df.coalesce(1).write.format("csv").options(header='true').mode("overwrite").save(result_location)
  # Find the generated part file (the only .csv in the folder)
  files = dbutils.fs.ls(result_location)
  csv_file = [x.path for x in files if x.path.endswith(".csv")][0]
  # Move it next to the folder as output.csv, then delete the folder and its marker files
  dbutils.fs.mv(csv_file, result_location.rstrip('/') + ".csv")
  dbutils.fs.rm(result_location, recurse=True)
The snippet above writes the DataFrame output to a single file with a specific name, here dbfs:///mnt/datalake/unmesha/output.csv.
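
The steps above could be wrapped in a reusable function. The following is a minimal sketch, not a definitive implementation: write_single_csv and the ".tmp/" folder suffix are hypothetical names chosen for illustration, and the fs argument is assumed to behave like dbutils.fs (ls, mv, rm). It does not handle a file that already exists at the target path.

```python
def write_single_csv(df, target_file, fs):
    """Write df as one CSV file at target_file (hypothetical helper).

    fs is assumed to behave like dbutils.fs, i.e. to provide ls, mv and rm.
    """
    # Temporary folder next to the target; Spark writes its part file here.
    tmp_dir = target_file + ".tmp/"
    (df.coalesce(1)
       .write.format("csv")
       .options(header="true")
       .mode("overwrite")
       .save(tmp_dir))

    # Expect exactly one CSV part file, since the data was coalesced to one partition.
    part_files = [f.path for f in fs.ls(tmp_dir) if f.path.endswith(".csv")]
    if len(part_files) != 1:
        raise RuntimeError(
            f"Expected exactly one CSV part file in {tmp_dir}, found {len(part_files)}"
        )

    # Give the part file its final name, then drop the folder and marker files.
    fs.mv(part_files[0], target_file)
    fs.rm(tmp_dir, recurse=True)
    return target_file
```

In a Databricks notebook this might be called as write_single_csv(df, "dbfs:///mnt/datalake/unmesha/output.csv", dbutils.fs). Checking the number of part files up front gives a clearer error than indexing into an empty list.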



  
