paolo@bimodesign.com | +34 608 61 64 10

HBase/Java: reduce fails with a shuffle error

While running a Hadoop MapReduce job written in Java, the reduce phase failed with this error:

Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#4

As mentioned in https://issues.apache.org/jira/browse/MAPREDUCE-6447, this error is a bug, and the workaround is to lower the value of "mapreduce.reduce.shuffle.input.buffer.percent", the share of the reducer's maximum heap used to store map outputs during the shuffle. In the official documentation https://hadoop.apache.org/docs/r1.0.4/mapred-default.html the property appears under its older name "mapred.job.shuffle.input.buffer.percent" with a default of 0.70; we decided to lower it to 0.50.
To apply this value from the Java code of the job, I added this setting to the job configuration:

conf.set("mapred.job.shuffle.input.buffer.percent", "0.50");
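To get a feeling for what the lower value changes, here is a rough sketch of the arithmetic: the shuffle buffer is approximately the reducer's maximum heap multiplied by this percentage. The class name, the method name and the 1 GiB heap are assumptions made for illustration only; they are not taken from Hadoop or from the job in this post, and Hadoop's own calculation may differ in details.

```java
// Sketch only: estimates the heap share reserved for shuffled map outputs.
// The 1 GiB heap below is a hypothetical value, not the heap of the job in this post.
final class ShuffleBufferEstimate {

    /** Returns the bytes of reducer heap that may hold map outputs during the shuffle. */
    static long bufferBytes(long maxHeapBytes, double inputBufferPercent) {
        if (maxHeapBytes < 0 || inputBufferPercent < 0.0 || inputBufferPercent > 1.0) {
            throw new IllegalArgumentException("heap must be >= 0 and percent within [0, 1]");
        }
        // Truncate to whole bytes.
        return (long) (maxHeapBytes * inputBufferPercent);
    }

    public static void main(String[] args) {
        long heap = 1024L * 1024 * 1024; // hypothetical reducer heap: 1 GiB
        long mib = 1024L * 1024;
        System.out.println("0.70 -> " + bufferBytes(heap, 0.70) / mib + " MiB");
        System.out.println("0.50 -> " + bufferBytes(heap, 0.50) / mib + " MiB");
    }
}
```

Under that assumed 1 GiB heap, the buffer shrinks from about 716 MiB to 512 MiB, which leaves more of the heap free for the rest of the reduce task.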

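Note: this is the console output of the MapReduce job. An explanation of these counters can be found at http://stackoverflow.com/a/27186925. The "--> ... GB" annotations on some lines are not part of the Hadoop output; they give the approximate size.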

        File System Counters
                FILE: Number of bytes read=486390889222
                FILE: Number of bytes written=687468686946
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=198283602026 -----> 184 GB 
                HDFS: Number of bytes written=207799910639 --> 193 GB
                HDFS: Number of read operations=4488
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=41
        Job Counters
                Killed map tasks=3
                Killed reduce tasks=1
                Launched map tasks=1479
                Launched reduce tasks=21
                Data-local map tasks=1476
                Rack-local map tasks=3
                Total time spent by all maps in occupied slots (ms)=83317707
                Total time spent by all reduces in occupied slots (ms)=104222174
                Total time spent by all map tasks (ms)=83317707
                Total time spent by all reduce tasks (ms)=52111087
                Total vcore-seconds taken by all map tasks=83317707
                Total vcore-seconds taken by all reduce tasks=52111087
                Total megabyte-seconds taken by all map tasks=42658665984
                Total megabyte-seconds taken by all reduce tasks=53361753088
        Map-Reduce Framework
                Map input records=142741637
                Map output records=142741637
                Map output bytes=200040264798
                Map output materialized bytes=200896701993 
                Input split bytes=202212
                Combine input records=0
                Combine output records=0
                Reduce input groups=1
                Reduce shuffle bytes=200896701993 --> 187 GB
                Reduce input records=142741637
                Reduce output records=142741637
                Spilled Records=419903970
                Shuffled Maps =29520
                Failed Shuffles=0
                Merged Map outputs=29520
                GC time elapsed (ms)=7061006
                CPU time spent (ms)=55573260
                Physical memory (bytes) snapshot=556034654208
                Virtual memory (bytes) snapshot=3712997023744
                Total committed heap usage (bytes)=512887357440 --> 477 GB
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        TREE COUNTERS
                NodesCommited=1
                NodesCreated=2
                PendingVectors=142741637
        File Input Format Counters
                Bytes Read=198283399814
        File Output Format Counters
                Bytes Written=207799909871
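
The GB figures annotated next to some counters above can be reproduced by dividing the raw byte counts by 1024^3 and dropping the fraction. A minimal sketch follows; the class and method names are made up for this example, while the byte values are copied from the counters above.

```java
// Sketch only: converts byte counters from the job output into whole GiB.
final class CounterSizes {

    static final long GIB = 1024L * 1024 * 1024;

    /** Returns the size in GiB, truncated to a whole number. */
    static long wholeGib(long bytes) {
        return bytes / GIB;
    }

    public static void main(String[] args) {
        // Values copied from the counters in the console output.
        System.out.println("HDFS bytes read:      " + wholeGib(198283602026L) + " GB");
        System.out.println("HDFS bytes written:   " + wholeGib(207799910639L) + " GB");
        System.out.println("Reduce shuffle bytes: " + wholeGib(200896701993L) + " GB");
        System.out.println("Committed heap usage: " + wholeGib(512887357440L) + " GB");
    }
}
```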