org.duracloud.services.hadoop.base
Class WholeFileInputFormat

java.lang.Object
  extended by org.apache.hadoop.mapred.FileInputFormat
      extended by org.duracloud.services.hadoop.base.WholeFileInputFormat
All Implemented Interfaces:
org.apache.hadoop.mapred.InputFormat
Direct Known Subclasses:
ICInputFormat

public class WholeFileInputFormat
extends org.apache.hadoop.mapred.FileInputFormat

An input format that never splits its input files, so each file is handled as a single unit. It uses SimpleFileRecordReader to produce key/value pairs derived only from the file path.


Field Summary
 
Fields inherited from class org.apache.hadoop.mapred.FileInputFormat
LOG
 
Constructor Summary
WholeFileInputFormat()
           
 
Method Summary
 org.apache.hadoop.mapred.RecordReader getRecordReader(org.apache.hadoop.mapred.InputSplit inputSplit, org.apache.hadoop.mapred.JobConf jobConf, org.apache.hadoop.mapred.Reporter reporter)
           
protected  boolean isSplitable(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path filename)
           
protected  org.apache.hadoop.fs.FileStatus[] listStatus(org.apache.hadoop.mapred.JobConf job)
          Overrides FileInputFormat.listStatus() to recursively collect FileStatus objects from the input path directories.
 
Methods inherited from class org.apache.hadoop.mapred.FileInputFormat
addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, getSplits, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

WholeFileInputFormat

public WholeFileInputFormat()
Method Detail

isSplitable

protected boolean isSplitable(org.apache.hadoop.fs.FileSystem fs,
                              org.apache.hadoop.fs.Path filename)
Overrides:
isSplitable in class org.apache.hadoop.mapred.FileInputFormat
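
The class description states that files are not split, which suggests this override always returns false. The following is a minimal sketch of what that implies for split planning. It is a simplified model written against the JDK only, not Hadoop's actual getSplits() logic; the class name SplitPlanSketch, the method planSplits, and the sizes used are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of split planning; not Hadoop's real algorithm.
class SplitPlanSketch {

    /**
     * Returns {offset, length} byte ranges for a single file.
     * When splitable is false, the whole file becomes one range.
     */
    static List<long[]> planSplits(long fileLength, long splitSize, boolean splitable) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<long[]> splits = new ArrayList<>();
        if (!splitable || fileLength <= splitSize) {
            // One range covering the entire file.
            splits.add(new long[] {0L, fileLength});
            return splits;
        }
        // Otherwise cut the file into consecutive ranges of at most splitSize bytes.
        for (long offset = 0; offset < fileLength; offset += splitSize) {
            long length = Math.min(splitSize, fileLength - offset);
            splits.add(new long[] {offset, length});
        }
        return splits;
    }

    public static void main(String[] args) {
        long fileLength = 250L; // assumed file size in bytes
        long splitSize = 100L;  // assumed target split size in bytes
        System.out.println("splitable:     " + planSplits(fileLength, splitSize, true).size() + " ranges");
        System.out.println("not splitable: " + planSplits(fileLength, splitSize, false).size() + " range");
    }
}
```

With a non-splittable format, the number of map inputs tracks the number of files rather than their size, which suits jobs that must see each file as a whole.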

getRecordReader

public org.apache.hadoop.mapred.RecordReader getRecordReader(org.apache.hadoop.mapred.InputSplit inputSplit,
                                                             org.apache.hadoop.mapred.JobConf jobConf,
                                                             org.apache.hadoop.mapred.Reporter reporter)
                                                      throws IOException
Specified by:
getRecordReader in interface org.apache.hadoop.mapred.InputFormat
Specified by:
getRecordReader in class org.apache.hadoop.mapred.FileInputFormat
Throws:
IOException
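
According to the class description, the reader returned here (SimpleFileRecordReader) produces key/value pairs based only on the file path. The sketch below models that idea with the JDK only: a reader that yields exactly one record per file and never opens the file. This page does not show the real reader's key and value types, so using the path string for both is an assumption, and the class WholeFileReaderSketch is hypothetical.

```java
import java.util.NoSuchElementException;

// Hypothetical model of a one-record-per-file reader; not the real SimpleFileRecordReader.
class WholeFileReaderSketch {
    private final String path;
    private boolean consumed = false;

    WholeFileReaderSketch(String path) {
        this.path = path;
    }

    /** True until the single record for this file has been returned. */
    boolean hasNext() {
        return !consumed;
    }

    /** Returns {key, value}; both are the file path here (an assumption). */
    String[] next() {
        if (consumed) {
            throw new NoSuchElementException("only one record per file");
        }
        consumed = true;
        return new String[] {path, path};
    }

    /** Progress is either nothing read or everything read. */
    float getProgress() {
        return consumed ? 1.0f : 0.0f;
    }

    public static void main(String[] args) {
        WholeFileReaderSketch reader = new WholeFileReaderSketch("/input/a.txt");
        while (reader.hasNext()) {
            String[] record = reader.next();
            System.out.println(record[0] + " -> " + record[1]);
        }
    }
}
```

Because the record carries only the path, a mapper built on this pattern would be responsible for opening the file itself if it needs the contents.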

listStatus

protected org.apache.hadoop.fs.FileStatus[] listStatus(org.apache.hadoop.mapred.JobConf job)
                                                throws IOException
Overrides FileInputFormat.listStatus() to recursively collect FileStatus objects from the input path directories.

Overrides:
listStatus in class org.apache.hadoop.mapred.FileInputFormat
Parameters:
job - the job whose input paths are listed
Returns:
the FileStatus objects found under the input paths
Throws:
IOException
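
The recursive collection described for listStatus can be illustrated with a minimal, standalone sketch. It uses java.io.File in place of Hadoop's FileSystem and FileStatus so that it runs without Hadoop on the classpath; the class RecursiveListingSketch and its methods are hypothetical and are not the actual implementation.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical JDK-only model of recursive input listing; not the real listStatus().
class RecursiveListingSketch {

    /** Collects every regular file found under the given input paths. */
    static List<File> listFilesRecursively(List<File> inputPaths) {
        List<File> files = new ArrayList<>();
        for (File path : inputPaths) {
            collect(path, files);
        }
        return files;
    }

    private static void collect(File path, List<File> out) {
        if (path.isDirectory()) {
            File[] children = path.listFiles();
            if (children == null) {
                return; // directory could not be read
            }
            for (File child : children) {
                collect(child, out); // descend into subdirectories
            }
        } else if (path.isFile()) {
            out.add(path);
        }
    }

    /** Builds a small temporary tree (root/top.txt, root/a/b/deep.txt) and counts its files. */
    static int demoCount() {
        try {
            File root = Files.createTempDirectory("listing-sketch").toFile();
            File nested = new File(root, "a/b");
            if (!nested.mkdirs()) {
                throw new IOException("could not create " + nested);
            }
            Files.createFile(new File(root, "top.txt").toPath());
            Files.createFile(new File(nested, "deep.txt").toPath());
            return listFilesRecursively(Collections.singletonList(root)).size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("files found: " + demoCount());
    }
}
```

The point of recursing is that files in nested subdirectories of an input path are included as job inputs, not only the files directly inside it.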


Copyright © 2009-2012 DuraSpace. All Rights Reserved.