Choose a topic to test your knowledge and improve your Apache Hadoop skills
What does commodity hardware mean in the Hadoop world?
Which of the following are NOT big data problem(s)?
What does "Velocity" in Big Data mean?
The term Big Data first originated from:
Which of the following batch processing instances is NOT an example of Big Data batch processing?
Which of the following are example(s) of Real Time Big Data Processing?
Sliding window operations typically fall in the category of __________________.
What is HBase used as?
What is Hive used as?
Which of the following are NOT true for Hadoop?
Which of the following are the core components of Hadoop?
Hadoop is open source.
Hive can be used for real time queries.
What is the default HDFS block size?
What is the default HDFS replication factor?
Which of the following is NOT a type of metadata in NameNode?
Which of the following is/are correct?
The mechanism used to create replicas in HDFS is ____________.
NameNode tries to keep the first copy of data nearest to the client machine.
Where is the HDFS replication factor controlled?
Which of the following Hadoop config files is used to define the heap size?
Which of the following is not a valid Hadoop config file?
Read the statement: NameNodes are usually high storage machines in the clusters.
From the options listed below, select the suitable data sources for Flume.
Read the statement and select the correct options: distcp command ALWAYS needs fully qualified HDFS paths.
Which of the following statement(s) are true about the distcp command? (A)
Which of the following is NOT a component of Flume? (B)
Which of the following is the correct sequence of MapReduce flow?
Which of the following can be used to control the number of part files in a MapReduce program's output directory?
Which of the following operations can't use the Reducer as a combiner?
Which of the following is/are true about combiners?
Reduce-side join is useful for:
Distributed Cache can be used in:
What is the optimal size of a file for distributed cache?
The number of mappers is decided by the:
Which of the following types of joins can be performed in a reduce-side join operation?
What should be the upper limit for counters of a MapReduce job?
Which of the following classes is responsible for converting inputs to key-value pairs in MapReduce?
Which of the following Writables can be used to know the value from a mapper/reducer?
A MapReduce job can be written in:
Pig is a:
Pig is good for:
Which of the following is the correct representation to access 'Skill' from the Bag {'Skills', 55, ('Skill', 'Speed'), {2, ('San', 'Mateo')}}?
Maximum size allowed for small dataset in replicated join is:
Parameters could be passed to Pig scripts from:
The schema of a relation can be examined through:
Data can be supplied to PigUnit tests from:
Which of the following constructs are valid Pig Control Structures?
Which of the following is the return data type of a Filter UDF?
Which of the following are not possible in Hive?
Who will initiate the mapper?
Which of the following are the Big Data Solutions Candidates?
Hadoop is a framework that allows the distributed processing of:
Which of the following are NOT metadata items?
What decides the number of mappers for a MapReduce job?
The NameNode monitors the block replication process.
Which of the following are true for Hadoop Pseudo Distributed Mode?
Which of the following statement(s) are correct?
Which of the following is true for Hive?
Which of the following is the highest level of Data Model in Hive?
Hive query response time is in the order of:
Managed tables in Hive: