I’ve been working with the Hadoop ecosystem quite a bit over the last three weeks. For all the documentation that exists on the topic, most books, online tutorials, and the like cover only the basic use cases. Many of these use cases revolve around shell scripts that call command-line tools (tiny apps with a main method that invoke other things).
By following the basic tutorials, it is easy to get started with the Hadoop ecosystem, especially if you are running MapReduce tasks on a local pseudo-distributed cluster or your app is co-located with the namenode (?) of the cluster. However, if you would like to treat the Hadoop cluster as a resource by submitting jobs from another server, the documentation on doing so in the context of HBase is sparse. Perhaps I just did not know the correct keywords to search for.
The good news is that this use case is straightforward once you know what’s going on.