Spring Batch for Hadoop workflows

An earlier post introduced Spring for Hadoop. The framework has made significant progress since then and now has multiple components.

Using Hadoop alongside Spring for Hadoop, we can now support scenarios such as:
  • Managing batches of data or running batch processes, such as calculations or formatting, with Spring Batch, and loading the results into or out of Hadoop workflows.

  • Building integration patterns with Spring Integration that can check a directory or FTP folder for new information, trigger a workflow, send an email, send an AMQP message, write a file, continuously query Pivotal GemFire, poll Twitter, and more.

  • Using Spring Data to interact with data in Redis, MongoDB, Neo4j, Pivotal GemFire, any JDBC-oriented database, Couchbase, FuzzyDB, Elasticsearch, or Solr, and move it into or out of Hadoop.

  • Having a user interface or some other business logic start a MapReduce job or move data into HDFS as part of a general Spring Framework interaction.
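
To make the "check a directory for new information, trigger a workflow" scenario above more concrete, here is a minimal sketch of the underlying pattern in plain Java. It deliberately does not use the Spring Integration or Spring for Hadoop APIs; the class name DirectoryPollSketch, the pollOnce method, and the workflow callback are hypothetical names introduced only for illustration. In a real setup the callback is where a Spring Batch job or a Hadoop workflow would presumably be launched.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;
import java.util.stream.Stream;

// Hypothetical illustration of "poll a directory, trigger a workflow".
// Plain JDK only; this is not the Spring Integration API.
class DirectoryPollSketch {
    private final Path inbox;                        // directory being watched
    private final Consumer<Path> workflow;           // stand-in for "start a job"
    private final Set<Path> seen = new HashSet<>();  // files already handed off

    DirectoryPollSketch(Path inbox, Consumer<Path> workflow) {
        this.inbox = inbox;
        this.workflow = workflow;
    }

    // Lists the directory once, invokes the workflow for every regular file
    // not seen on a previous poll, and returns those new files.
    List<Path> pollOnce() {
        List<Path> fresh = new ArrayList<>();
        try (Stream<Path> entries = Files.list(inbox)) {
            entries.filter(Files::isRegularFile)
                   .sorted()
                   .forEach(p -> {
                       if (seen.add(p)) {
                           fresh.add(p);
                       }
                   });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        fresh.forEach(workflow);
        return fresh;
    }

    public static void main(String[] args) throws IOException {
        Path inbox = Files.createTempDirectory("inbox");
        DirectoryPollSketch poller = new DirectoryPollSketch(
                inbox,
                p -> System.out.println("would start workflow for " + p.getFileName()));

        Files.createFile(inbox.resolve("events.log"));
        poller.pollOnce();  // triggers the callback for events.log
        poller.pollOnce();  // nothing new, so the callback is not invoked
    }
}
```

A scheduler would call pollOnce at a fixed interval. Tracking already-seen files in memory keeps the sketch short, but that state is lost on restart, so a production poller would need a durable record of processed files.
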
source: http://blog.gopivotal.com/products/programming-with-hadoop-101-getting-started-with-spring-hadoop