Single-node Accumulo development cluster in one command…

I’m really pleased to share accumulo-vagrant, which uses Vagrant to create a single-node Accumulo cluster for development.

Nearly every distributed database seems to depend on ZooKeeper or HDFS these days, and while developing with them I often use the convenient HDP 2.x sandboxes from Hortonworks. These VirtualBox images give me the latest Hadoop components playing nicely together, but running all of them at once (HBase, Hive, Storm and the rest) is heavy when all you need is HDFS and ZooKeeper, as is the case with Accumulo. If you want to go even lighter for development, and it fits your use case, you can check out Mock Accumulo or the Mini Accumulo cluster.

I enjoyed putting together this Vagrant script, which uses an excellent and relatively new Ambari feature, Blueprints. Blueprints open up Ambari/Hadoop cluster configuration and deployment through a RESTful JSON API. Once all the bits are in place and ZooKeeper and HDFS are running, the Vagrant script runs ‘accumulo init’ and your cluster is ready for development! When creating this Vagrant script, I referenced ambari-vagrant by the generous folks at SequenceIQ, who publish a lot of great open source work around Hadoop deployment automation. I hope you find this git repo helpful, and I look forward to your comments and suggestions.
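
For anyone curious what the Blueprints step looks like, here is a rough Python sketch of the two JSON documents involved: a blueprint describing one host group with only HDFS and ZooKeeper components, and a cluster creation template that maps that host group to a host. This is not the script from the repo. The blueprint name, host group name, hostname, password and stack version below are all placeholder assumptions, and the component list is my guess at a minimal single-node layout.

```python
import json

# Assumed minimal component set for a single-node HDFS + ZooKeeper layout.
COMPONENTS = [
    "NAMENODE",
    "SECONDARY_NAMENODE",
    "DATANODE",
    "HDFS_CLIENT",
    "ZOOKEEPER_SERVER",
    "ZOOKEEPER_CLIENT",
]


def build_blueprint(name, stack_name="HDP", stack_version="2.1"):
    """Return a blueprint document with a single host group.

    The stack version is an assumption; use whatever your Ambari serves.
    """
    return {
        "Blueprints": {
            "blueprint_name": name,
            "stack_name": stack_name,
            "stack_version": stack_version,
        },
        "host_groups": [
            {
                "name": "host_group_1",  # hypothetical group name
                "cardinality": "1",
                "components": [{"name": c} for c in COMPONENTS],
            }
        ],
    }


def build_cluster_template(blueprint_name, fqdn, password="changeme"):
    """Return a cluster creation template binding the host group to one host."""
    return {
        "blueprint": blueprint_name,
        "default_password": password,  # placeholder value
        "host_groups": [
            {"name": "host_group_1", "hosts": [{"fqdn": fqdn}]}
        ],
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    blueprint = build_blueprint("accumulo-dev")
    template = build_cluster_template("accumulo-dev", "accumulo.example.local")
    print(json.dumps(blueprint, indent=2))
    print(json.dumps(template, indent=2))
```

With an Ambari server running, the first document would be POSTed to /api/v1/blueprints/<blueprint name> and the second to /api/v1/clusters/<cluster name>, both with an X-Requested-By header and Ambari credentials. The sketch only builds and prints the JSON, so it runs without a server.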