RDD

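A Resilient Distributed Dataset (RDD) is the fundamental data structure of Spark. It is an immutable, distributed collection of objects. RDDs support two kinds of operations: actions, which run a computation and return a value, and transformations, which are evaluated lazily and return a reference to a new RDD instead of modifying the existing one.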
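
The difference between transformations and actions can be illustrated with a small single-process sketch. This is a toy model, not Spark: `ToyRDD` is a hypothetical class written for this illustration, and it ignores distribution and fault tolerance entirely. Its method names (`parallelize`, `map`, `filter`, `collect`, `count`) are chosen to mirror the names of common RDD operations, on the assumption that the analogy helps.

```python
# Toy model only: ToyRDD is a hypothetical class, not part of Spark or PySpark.
from typing import Callable, Iterable, Iterator, List


class ToyRDD:
    """An immutable, lazily evaluated collection (single-process stand-in for an RDD)."""

    def __init__(self, compute: Callable[[], Iterator]):
        # Store a recipe for producing the elements, not the elements themselves.
        self._compute = compute

    @staticmethod
    def parallelize(data: Iterable) -> "ToyRDD":
        snapshot = tuple(data)  # copy so later changes to `data` cannot leak in
        return ToyRDD(lambda: iter(snapshot))

    # Transformations: return a new ToyRDD and do no work yet.
    def map(self, f: Callable) -> "ToyRDD":
        return ToyRDD(lambda: (f(x) for x in self._compute()))

    def filter(self, pred: Callable) -> "ToyRDD":
        return ToyRDD(lambda: (x for x in self._compute() if pred(x)))

    # Actions: run the recipe and return a plain value.
    def collect(self) -> List:
        return list(self._compute())

    def count(self) -> int:
        return sum(1 for _ in self._compute())


numbers = ToyRDD.parallelize([1, 2, 3, 4, 5])

# Two chained transformations: nothing is computed here, only a new ToyRDD is built.
evens_squared = numbers.filter(lambda x: x % 2 == 0).map(lambda x: x * x)

# Actions trigger the computation and return ordinary Python values.
result = evens_squared.collect()  # [4, 16]
total = numbers.count()           # 5; the original collection is unchanged
```

Each transformation wraps the previous recipe in a new object, so `numbers` is never altered, and no element is touched until an action such as `collect` or `count` is called.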