H2O is a machine learning framework implemented in Java that provides APIs for Scala, Python and R. It has the following distinguishing features:
– It has an interactive web GUI (Flow), so you can work without any programming.
– The generated trained models can be deployed easily in a production JVM environment with minimal dependencies.
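
To make the deployment point concrete, here is a minimal sketch of scoring a single row on a plain JVM with H2O's `h2o-genmodel` library, assuming the trained model was exported as a MOJO file. The file name `drf_model.zip` and the feature names are hypothetical placeholders, and `predictMultinomial` assumes a model with more than two classes (a two-class model would use `predictBinomial` instead).

```scala
import hex.genmodel.MojoModel
import hex.genmodel.easy.{EasyPredictModelWrapper, RowData}

// Hypothetical path to a MOJO exported from a trained model.
val mojo = MojoModel.load("drf_model.zip")
val wrapper = new EasyPredictModelWrapper(mojo)

// Hypothetical feature names and values; they must match the training columns.
val row = new RowData()
row.put("sepal_len", "5.1")
row.put("sepal_wid", "3.5")

// Score the row and print the predicted class with its class probabilities.
val prediction = wrapper.predictMultinomial(row)
println(s"label = ${prediction.label}")
println(s"probabilities = ${prediction.classProbabilities.mkString(", ")}")
```

Only the `h2o-genmodel` jar needs to be on the classpath for this; no running H2O cluster or Spark session is involved.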

In this document we demonstrate how H2O can be used with Scala and Spark (Sparkling Water) to run a ‘Distributed Random Forest’ classification.
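
As a rough orientation, training such a model might look like the sketch below. It is not the code from the notebook: it assumes the Sparkling Water 3.x API (`ai.h2o.sparkling`), and the input file `data.csv`, the column name `label` and the parameter values are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import ai.h2o.sparkling.H2OContext
import ai.h2o.sparkling.ml.algos.H2ODRF

val spark = SparkSession.builder()
  .appName("drf-sketch")
  .master("local[*]")
  .getOrCreate()

// Start (or attach to) the H2O cluster that runs alongside Spark.
val hc = H2OContext.getOrCreate()

// Hypothetical input: a CSV with feature columns and a string column "label".
val data = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv("data.csv")
val Array(train, test) = data.randomSplit(Array(0.8, 0.2), seed = 42L)

// Configure and train the Distributed Random Forest estimator.
val drf = new H2ODRF()
  .setLabelCol("label")
  .setNtrees(50)
val model = drf.fit(train)

// Append the prediction columns to the held-out rows.
model.transform(test).show(5, truncate = false)
```

Older Sparkling Water releases expose a different package layout, so the imports may need adjusting to the version in use.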

H2O and the related documentation for Sparkling Water can be found here

I am using Jupyter with the BeakerX Scala kernel.

The document can be found at the following gist.

