Spark Forum - Data Science

This page is a Spark community service, for learning Spark with Java.

01/02/2023

Basic Spark interview questions from a mock interview.

1) Why did you shift from MapReduce development to Spark development?
2) How is the Spark engine different from the Hadoop MapReduce engine?
3) What are the steps for optimizing Spark jobs?
4) What are an executor and an executor core? Explain in terms of processes and threads.
5) How do you identify that your Hive script is slow?
6) When do we use partitioning and bucketing in Hive?
7) What is the small file problem in Hive? ---> Skewness
8) How do you handle a high-cardinality issue in a dataset, with respect to Hive?
9) How do you handle code merging with other teams? Explain your development process.
10) Again, the small files issue in Hadoop?
11) Metadata size in Hadoop?
12) How is Spark differentiated from MapReduce?
13) A class has 3 fields (name, age, salary) and you are creating a series of objects from this class. How do you compare the objects? (I didn't get the question exactly)
14) Scala: what is === in join conditions? What does it mean?

29/01/2021

Why Spark? What is Spark? Why is it more beneficial than Hadoop?

Spark is an in-memory processing engine, so it can process heavy data with ease. In Hadoop MapReduce and some other big data tools, intermediate results are written to an external file system between processing steps. Spark avoids most of this by keeping intermediate data in memory where it can. Spark's main APIs are available in Java, Scala and Python, and R is supported as well.
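To make the difference concrete, here is a rough plain-Python sketch (not Spark or Hadoop code). It assumes a toy two-stage pipeline; the names run_with_disk, run_in_memory, square and keep_even are made up for illustration. One runner writes every intermediate result to a file and reads it back, roughly the way chained MapReduce jobs behave, and the other passes the data along in memory.

```python
import json
import os
import tempfile


def square(values):
    return [v * v for v in values]


def keep_even(values):
    return [v for v in values if v % 2 == 0]


def run_with_disk(values, stages):
    """Each stage reads its input from a file and writes its output to a new file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "stage_0.json")
        with open(path, "w") as f:
            json.dump(list(values), f)
        for i, stage in enumerate(stages, start=1):
            with open(path) as f:
                data = json.load(f)          # read previous stage from disk
            path = os.path.join(tmp, f"stage_{i}.json")
            with open(path, "w") as f:
                json.dump(stage(data), f)    # write this stage back to disk
        with open(path) as f:
            return json.load(f)


def run_in_memory(values, stages):
    """Each stage hands its output directly to the next one, with no file I/O."""
    data = list(values)
    for stage in stages:
        data = stage(data)
    return data


stages = [square, keep_even]
numbers = [1, 2, 3, 4, 5, 6]
disk_result = run_with_disk(numbers, stages)
memory_result = run_in_memory(numbers, stages)
print(disk_result == memory_result)
```

Both runners give the same answer. The only difference is the extra read and write per stage in the disk version, which is the cost Spark tries to avoid on real clusters.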

26/01/2021

RDD (Resilient Distributed Dataset) is the fundamental data structure of Spark. It is an immutable, distributed collection of objects. Each RDD is divided into logical partitions, which may be computed on different nodes of a cluster.
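The idea of immutable, partitioned data can be sketched in plain Python. This is only a simulation and not the Spark RDD API: make_partitions and map_partitions are hypothetical helpers, tuples stand in for immutable partitions, and a thread pool stands in for the cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def make_partitions(items, num_partitions):
    """Split items into num_partitions contiguous logical partitions."""
    items = tuple(items)
    size, extra = divmod(len(items), num_partitions)
    partitions, start = [], 0
    for i in range(num_partitions):
        # The first `extra` partitions take one additional element.
        end = start + size + (1 if i < extra else 0)
        partitions.append(items[start:end])
        start = end
    return tuple(partitions)


def map_partitions(partitions, func):
    """Apply func to every element, one worker per partition, returning new partitions."""
    def process(partition):
        return tuple(func(x) for x in partition)

    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        # pool.map keeps the results in the same order as the input partitions.
        return tuple(pool.map(process, partitions))


source = make_partitions(range(1, 8), 3)
doubled = map_partitions(source, lambda x: x * 2)
print(source)
print(doubled)
```

The transformation never changes source. It builds a new set of partitions instead, which mirrors how a transformation on an RDD produces a new RDD and leaves the original as it was.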
