Tuesday 14 June 2022

Join using an expression variable as the on condition in Databricks with PySpark

Let's see how to join two tables with a parameterized on condition in PySpark.

E.g.: I have two DataFrames, A and B, and I want to join them on id, invc_no, item and subItem.


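# Build the join condition once and keep it in a variable
onExpr = [(A.id == B.id) &
          (A.invc_no == B.invc_no) &
          (A.item == B.item) &
          (A.subItem == B.subItem)]

# Left join on the expression, keeping only A's columns
# (A[c] avoids ambiguous references, since B has columns with the same names)
dailySaleDF = A.join(B, onExpr, 'left').select([A[c] for c in A.columns])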
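
If the key columns themselves need to be a parameter, the same condition can be built from a list of names instead of being typed out by hand. The snippet below is only a minimal sketch of that idea: the helper names build_join_condition, left_join_on_keys and demo, and the sample rows and the qty and price columns, are made up for illustration and are not part of the original example. It assumes both DataFrames use the same names for the key columns, and that a SparkSession (the spark variable in a Databricks notebook) is passed to demo.

```python
from functools import reduce
from operator import and_


def build_join_condition(left, right, keys):
    """Combine left[k] == right[k] for every key into one AND-ed expression."""
    if not keys:
        raise ValueError("keys must contain at least one column name")
    return reduce(and_, [left[k] == right[k] for k in keys])


def left_join_on_keys(left, right, keys):
    """Left-join on the given keys and keep only the left DataFrame's columns."""
    on_expr = build_join_condition(left, right, keys)
    # left[c] (not a bare name) avoids ambiguity with same-named columns in right
    return left.join(right, on_expr, "left").select([left[c] for c in left.columns])


def demo(spark):
    """Hypothetical usage; the sample rows below are invented for illustration."""
    cols = ["id", "invc_no", "item", "subItem"]
    A = spark.createDataFrame([(1, "INV1", "pen", "cap", 10)], cols + ["qty"])
    B = spark.createDataFrame([(1, "INV1", "pen", "cap", 2.5)], cols + ["price"])
    return left_join_on_keys(A, B, cols)
```

In a notebook, demo(spark).show() should display the joined rows. When the key names are identical on both sides, passing the list of names directly, as in A.join(B, ["id", "invc_no", "item", "subItem"], "left"), is a simpler alternative; the expression form is mainly useful when the condition has to be assembled or varied at run time.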


