How do you create an RDD in PySpark?
You can create an RDD by calling sc.parallelize() on a Python collection:

rdd = sc.parallelize([1, 2, 3, 4, 5])

What is the difference between map() and flatMap() in PySpark?
The map() transformation applies a function to each element of the RDD and returns a new RDD with exactly one result per input element. The flatMap() transformation can return zero or more elements for each input element and flattens the results into a single RDD.

What is the difference between transformations and actions?
Transformations are lazy operations that define a new RDD from an existing one, such as map() or filter(). Actions trigger the execution of the accumulated transformations and return results to the driver program, such as count() or collect().

How do you join two DataFrames in PySpark?
Use the join() function on DataFrames. For example, to perform an inner join:

df1.join(df2, df1.id == df2.id, 'inner')