Download Cloudera Certified Developer for Apache Hadoop (CCDH).CCD-410.TestKing.2018-11-02.32q.tqb

Vendor: Cloudera
Exam Code: CCD-410
Exam Name: Cloudera Certified Developer for Apache Hadoop (CCDH)
Date: Nov 02, 2018
File Size: 219 KB
Downloads: 1

Demo Questions

Question 1
How are keys and values presented and passed to the reducers during a standard sort and shuffle phase of MapReduce? 
  A. Keys are presented to a reducer in sorted order; values for a given key are not sorted.
  B. Keys are presented to a reducer in sorted order; values for a given key are sorted in ascending order.
  C. Keys are presented to a reducer in random order; values for a given key are not sorted.
  D. Keys are presented to a reducer in random order; values for a given key are sorted in ascending order.
Correct answer: A
Explanation:
Reducer has 3 primary phases:
1. Shuffle 
The Reducer copies the sorted output from each Mapper using HTTP across the network. 
2. Sort 
The framework merge sorts Reducer inputs by keys (since different Mappers may have output the same key). 
The shuffle and sort phases occur simultaneously i.e. while outputs are being fetched they are merged. 
SecondarySort 
To achieve a secondary sort on the values returned by the value iterator, the application should extend the key with the secondary key and define a grouping comparator. The keys will be sorted using the entire key, but will be grouped using the grouping comparator to decide which keys and values are sent in the same call to reduce. 
3. Reduce 
In this phase the reduce(Object, Iterable, Context) method is called for each <key, (collection of values)> in the sorted inputs. 
The output of the reduce task is typically written to a RecordWriter via TaskInputOutputContext.write(Object, Object). 
The output of the Reducer is not re-sorted. 
Reference: org.apache.hadoop.mapreduce, Class Reducer<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
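The ordering described above can be illustrated with a small plain-Java simulation. This is only a sketch and does not use Hadoop: the class name ShuffleSortSketch and the method shuffleAndSort are hypothetical, and a TreeMap stands in for the framework's merge sort. Keys come out sorted, while the values for each key simply keep the order in which they arrived. In a real job even that arrival order is not guaranteed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of the reduce-side merge; this is not Hadoop code.
class ShuffleSortSketch {

    // Merges the (key, value) pairs emitted by several mappers.
    // The TreeMap keeps keys sorted; each value list keeps arrival order.
    static Map<String, List<Integer>> shuffleAndSort(
            List<List<Map.Entry<String, Integer>>> mapperOutputs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (List<Map.Entry<String, Integer>> output : mapperOutputs) {
            for (Map.Entry<String, Integer> pair : output) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapper1 =
                List.of(Map.entry("pear", 9), Map.entry("apple", 5));
        List<Map.Entry<String, Integer>> mapper2 =
                List.of(Map.entry("apple", 2), Map.entry("pear", 1));
        // Keys are sorted (apple, pear); values are not (5 before 2, 9 before 1).
        System.out.println(shuffleAndSort(List.of(mapper1, mapper2)));
        // prints {apple=[5, 2], pear=[9, 1]}
    }
}
```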
Question 2
Assuming default settings, which best describes the order of data provided to a reducer’s reduce method?
  A. The keys given to a reducer aren’t in a predictable order, but the values associated with those keys always are.
  B. Both the keys and values passed to a reducer always appear in sorted order.
  C. Neither keys nor values are in any predictable order.
  D. The keys given to a reducer are in sorted order, but the values associated with each key are in no predictable order.
Correct answer: D
Explanation:
Reducer has 3 primary phases:
1. Shuffle 
The Reducer copies the sorted output from each Mapper using HTTP across the network. 
2. Sort 
The framework merge sorts Reducer inputs by keys (since different Mappers may have output the same key). 
The shuffle and sort phases occur simultaneously i.e. while outputs are being fetched they are merged. 
SecondarySort 
To achieve a secondary sort on the values returned by the value iterator, the application should extend the key with the secondary key and define a grouping comparator. The keys will be sorted using the entire key, but will be grouped using the grouping comparator to decide which keys and values are sent in the same call to reduce. 
3. Reduce 
In this phase the reduce(Object, Iterable, Context) method is called for each <key, (collection of values)> in the sorted inputs. 
The output of the reduce task is typically written to a RecordWriter via TaskInputOutputContext.write(Object, Object). 
The output of the Reducer is not re-sorted. 
Reference: org.apache.hadoop.mapreduce, Class Reducer<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
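The secondary-sort idea mentioned in the explanation can be sketched in plain Java as well. This is a simulation under stated assumptions, not the Hadoop API: CompositeKey, SORT_ORDER and groupForReduce are hypothetical names. The comparator plays the role of the sort comparator over the entire key, and the grouping step plays the role of a grouping comparator that looks only at the natural key.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of a secondary sort; this is not Hadoop code.
class SecondarySortSketch {

    // Composite key: the natural key extended with the value to sort on.
    static final class CompositeKey {
        final String naturalKey;
        final int secondary;

        CompositeKey(String naturalKey, int secondary) {
            this.naturalKey = naturalKey;
            this.secondary = secondary;
        }
    }

    // Sort step: orders by the entire composite key.
    static final Comparator<CompositeKey> SORT_ORDER =
            Comparator.comparing((CompositeKey c) -> c.naturalKey)
                      .thenComparingInt(c -> c.secondary);

    // Grouping step: keys with the same natural key share one "reduce call".
    static Map<String, List<Integer>> groupForReduce(List<CompositeKey> keys) {
        List<CompositeKey> sorted = new ArrayList<>(keys);
        sorted.sort(SORT_ORDER);
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (CompositeKey key : sorted) {
            groups.computeIfAbsent(key.naturalKey, k -> new ArrayList<>())
                  .add(key.secondary);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<CompositeKey> keys = List.of(
                new CompositeKey("b", 3), new CompositeKey("a", 7),
                new CompositeKey("b", 1), new CompositeKey("a", 2));
        System.out.println(groupForReduce(keys));
        // prints {a=[2, 7], b=[1, 3]}
    }
}
```

Unlike the default behavior, the values within each group now arrive in ascending order, because the sort saw the secondary field as part of the key.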
Question 3
You want to populate an associative array in order to perform a map-side join. You’ve decided to put this information in a text file, place that file into the DistributedCache and read it in your Mapper before any records are processed. 
Identify which method in the Mapper you should use to implement code for reading the file and populating the associative array.
  A. combine
  B. map
  C. init
  D. configure
Correct answer: D
Explanation:
See step 3 below: in the old (org.apache.hadoop.mapred) API, configure(JobConf) is called once per task before any call to map, which makes it the place to read the cached file.
Here is an illustrative example on how to use the DistributedCache:
     // Setting up the cache for the application 
      
     1. Copy the requisite files to the FileSystem:
      
     $ bin/hadoop fs -copyFromLocal lookup.dat /myapp/lookup.dat   
     $ bin/hadoop fs -copyFromLocal map.zip /myapp/map.zip   
     $ bin/hadoop fs -copyFromLocal mylib.jar /myapp/mylib.jar 
     $ bin/hadoop fs -copyFromLocal mytar.tar /myapp/mytar.tar 
     $ bin/hadoop fs -copyFromLocal mytgz.tgz /myapp/mytgz.tgz 
     $ bin/hadoop fs -copyFromLocal mytargz.tar.gz /myapp/mytargz.tar.gz 
      
     2. Set up the application's JobConf:
      
     JobConf job = new JobConf(); 
     DistributedCache.addCacheFile(new URI("/myapp/lookup.dat#lookup.dat"), job); 
     DistributedCache.addCacheArchive(new URI("/myapp/map.zip"), job); 
     DistributedCache.addFileToClassPath(new Path("/myapp/mylib.jar"), job); 
     DistributedCache.addCacheArchive(new URI("/myapp/mytar.tar"), job); 
     DistributedCache.addCacheArchive(new URI("/myapp/mytgz.tgz"), job); 
     DistributedCache.addCacheArchive(new URI("/myapp/mytargz.tar.gz"), job); 
      
     3. Use the cached files in the Mapper or Reducer:
      
     public static class MapClass extends MapReduceBase   
     implements Mapper<K, V, K, V> { 
      
       private Path[] localArchives; 
       private Path[] localFiles; 
        
       public void configure(JobConf job) { 
         // Get the cached archives/files 
         localArchives = DistributedCache.getLocalCacheArchives(job); 
         localFiles = DistributedCache.getLocalCacheFiles(job); 
       } 
        
       public void map(K key, V value,  
                       OutputCollector<K, V> output, Reporter reporter)  
       throws IOException { 
         // Use data from the cached archives/files here 
         // ... 
         // ... 
         output.collect(key, value); 
       } 
     } 
      
Reference: org.apache.hadoop.filecache , Class DistributedCache
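The map-side join pattern from the question can be sketched without Hadoop as follows. This is a simplified illustration, not the DistributedCache API: the class MapSideJoinSketch is hypothetical, its configure and map methods only mirror the roles of the Mapper methods of the same name, and the tab-separated "key, value" layout of the lookup file is an assumption.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java sketch of a map-side join; this is not Hadoop code.
class MapSideJoinSketch {

    // The associative array populated before any record is processed.
    private final Map<String, String> lookup = new HashMap<>();

    // Plays the role of configure(): runs once, before the first map() call.
    // Each line is assumed to hold "key<TAB>value".
    void configure(List<String> lookupLines) {
        for (String line : lookupLines) {
            String[] parts = line.split("\t", 2);
            if (parts.length == 2) {
                lookup.put(parts[0], parts[1]);
            }
        }
    }

    // Plays the role of map(): joins one record against the in-memory table.
    // Returns null when the key has no match (an inner join drops the record).
    String map(String key, String value) {
        String matched = lookup.get(key);
        return matched == null ? null : key + "\t" + value + "\t" + matched;
    }

    public static void main(String[] args) {
        MapSideJoinSketch mapper = new MapSideJoinSketch();
        mapper.configure(List.of("1\tAlice", "2\tBob"));
        System.out.println(mapper.map("2", "order-17"));
    }
}
```

Loading the table once in configure rather than in map avoids re-reading the file for every input record.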