https://github.com/a-imantha/average-calculation-map-reduce

Calculating the average of a list of numbers with a MapReduce approach on Hadoop.

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (6.6%) to scientific vocabulary

Keywords

average combiner hadoop hadoop-mapreduce map mapper mapreduce mean reduce reducer
Last synced: 5 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: a-Imantha
  • Language: Java
  • Default Branch: master
  • Size: 6.84 KB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
average combiner hadoop hadoop-mapreduce map mapper mapreduce mean reduce reducer
Created about 5 years ago · Last pushed about 5 years ago
Metadata Files
Readme

README.md

Average Calculation Problem with a MapReduce Approach

This is an example program that calculates the average of a list of numbers using MapReduce on the Hadoop framework.

NOTES:

The program should run on a Hadoop cluster. The pom file is configured for Hadoop 2.10; change it to match your Hadoop version.

Package the program as a runnable jar and run it with the following arguments:
  • Input file location
  • Output folder location
  • Maximum number of mapper classes to split the problem into (optional, default = 10)
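
The argument handling described above could look roughly like the following. This is a minimal sketch, not the repository's actual driver: the class name `JobArgs`, its field names, and the usage message are hypothetical, and only the argument order and the default of 10 come from the notes above.

```java
// Hypothetical helper (not from the repository) that reads the three
// arguments described in the notes: input file, output folder, and an
// optional maximum number of mapper classes.
final class JobArgs {
    // Default taken from the README; the constant name is an assumption.
    static final int DEFAULT_MAX_CLASSES = 10;

    final String inputPath;
    final String outputPath;
    final int maxClasses;

    private JobArgs(String inputPath, String outputPath, int maxClasses) {
        this.inputPath = inputPath;
        this.outputPath = outputPath;
        this.maxClasses = maxClasses;
    }

    static JobArgs parse(String[] args) {
        if (args.length < 2 || args.length > 3) {
            throw new IllegalArgumentException(
                "usage: <input file> <output folder> [max mapper classes]");
        }
        // The third argument is optional and falls back to the default.
        int maxClasses = DEFAULT_MAX_CLASSES;
        if (args.length == 3) {
            maxClasses = Integer.parseInt(args[2]);
        }
        if (maxClasses < 1) {
            throw new IllegalArgumentException("max mapper classes must be at least 1");
        }
        return new JobArgs(args[0], args[1], maxClasses);
    }

    public static void main(String[] args) {
        JobArgs parsed = parse(new String[] {"numbers.txt", "out"});
        System.out.println(parsed.maxClasses); // prints 10
    }
}
```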

The input file is a UTF-8 text file containing one number per line. Each number can be either an int or a double.
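
To illustrate the input format, here is a small sketch of how such lines could be turned into numbers. The class and method names are hypothetical, and treating every value as a double and skipping blank lines are assumptions, not documented behaviour of the program.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical parser for the one-number-per-line input format.
final class InputLines {
    static List<Double> parseNumbers(List<String> lines) {
        List<Double> numbers = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // assumption: blank lines are ignored
            }
            // Double.parseDouble accepts both "3" and "2.5", so ints and
            // doubles can share one code path.
            numbers.add(Double.parseDouble(trimmed));
        }
        return numbers;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("3", "2.5", "", "-7");
        System.out.println(parseNumbers(lines)); // prints [3.0, 2.5, -7.0]
    }
}
```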

Development Approach

The code includes a Mapper, a Combiner and a Reducer. The Mapper splits the list of numbers into at most the given number of classes (default 10) and hands them over to the Combiner. The Combiner collapses the classes it receives into a single key called 'Average'. These 'Average' keys are then reduced by the Reducer to print the final output.
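
The three stages can be simulated in plain Java without a cluster. The sketch below is an illustration under stated assumptions, not the repository's code: it does not use the Hadoop API, the names `AverageSketch` and `SumCount` are hypothetical, numbers are assigned to classes round-robin, and each partial result is carried as a (sum, count) pair so that partials can be merged safely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of the map, combine and reduce stages.
final class AverageSketch {

    // Partial aggregate: a running sum and the number of values behind it.
    static final class SumCount {
        final double sum;
        final long count;

        SumCount(double sum, long count) {
            this.sum = sum;
            this.count = count;
        }

        SumCount plus(SumCount other) {
            return new SumCount(sum + other.sum, count + other.count);
        }
    }

    // Map: place each number into one of at most maxClasses classes.
    // Round-robin assignment is an assumption made for this sketch.
    static Map<Integer, List<SumCount>> map(List<Double> numbers, int maxClasses) {
        Map<Integer, List<SumCount>> classes = new HashMap<>();
        for (int i = 0; i < numbers.size(); i++) {
            int key = i % maxClasses;
            classes.computeIfAbsent(key, k -> new ArrayList<>())
                   .add(new SumCount(numbers.get(i), 1));
        }
        return classes;
    }

    // Combine: collapse every class into one partial; conceptually all
    // partials are re-keyed under the single key "Average".
    static List<SumCount> combine(Map<Integer, List<SumCount>> classes) {
        List<SumCount> partials = new ArrayList<>();
        for (List<SumCount> values : classes.values()) {
            SumCount acc = new SumCount(0.0, 0);
            for (SumCount value : values) {
                acc = acc.plus(value);
            }
            partials.add(acc);
        }
        return partials;
    }

    // Reduce: merge the partials for "Average" and divide once at the end.
    static double reduce(List<SumCount> partials) {
        SumCount total = new SumCount(0.0, 0);
        for (SumCount partial : partials) {
            total = total.plus(partial);
        }
        if (total.count == 0) {
            throw new IllegalArgumentException("no input numbers");
        }
        return total.sum / total.count;
    }

    public static void main(String[] args) {
        List<Double> numbers = Arrays.asList(1.0, 2.0, 3.0, 4.0, 10.0);
        System.out.println(reduce(combine(map(numbers, 3)))); // prints 4.0
    }
}
```

Carrying sum and count separately matters because the mean of per-class means equals the overall mean only when every class has the same size. In the example, the three classes hold two, two and one values, so averaging their means directly would give a different result from 4.0.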

Owner

  • Name: Imantha Ahangama
  • Login: a-Imantha
  • Kind: user
  • Location: Sri Lanka

Dependencies

averageprob/pom.xml maven
  • org.apache.hadoop:hadoop-client 2.10.1
  • junit:junit 3.8.1 test