How many maps are there in a particular Job?

Answer

The number of maps is usually driven by the total size of the inputs, that is, the total number of blocks of the input files.
As a rule of thumb, aim for around 10-100 maps per node. Task setup takes a while, so it is best if each map takes at least a minute to execute.
For example, if you expect 10TB of input data and have a block size of 128MB, you'll end up with about 82,000 maps. To request a different number of maps, you can set the mapreduce.job.maps parameter, which only provides a hint to the framework. Ultimately, the number of map tasks is controlled by the number of splits returned by the InputFormat.getSplits() method (which you can override).
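
The 10TB figure can be checked with a short calculation. The Python sketch below is only a rough estimate: it assumes one map per block and a single large splittable input, and the function name estimate_map_count is made up for illustration (it is not part of Hadoop). The real count comes from the splits the InputFormat returns, so it can differ, for example when the input consists of many small files.

```python
def estimate_map_count(total_input_bytes: int, block_size_bytes: int) -> int:
    """Rough estimate: one map task per block (assumption, not Hadoop's exact logic)."""
    if total_input_bytes < 0 or block_size_bytes <= 0:
        raise ValueError("input size must be >= 0 and block size must be > 0")
    # Integer ceiling division, so a partially filled last block still counts as one map.
    return -(-total_input_bytes // block_size_bytes)


TB = 1024 ** 4  # bytes in one terabyte (binary)
MB = 1024 ** 2  # bytes in one megabyte (binary)

# 10TB of input with a 128MB block size, as in the answer above.
maps = estimate_map_count(10 * TB, 128 * MB)
print(maps)
```

This gives 81,920, which the answer rounds to 82,000.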
