{"id":565,"date":"2018-09-24T20:24:14","date_gmt":"2018-09-24T18:24:14","guid":{"rendered":"https:\/\/www.pschatzmann.ch\/home\/?p=565"},"modified":"2020-11-21T22:22:51","modified_gmt":"2020-11-21T21:22:51","slug":"deeplearningforj-recurrent-neural-networks-rnn","status":"publish","type":"post","link":"https:\/\/www.pschatzmann.ch\/home\/2018\/09\/24\/deeplearningforj-recurrent-neural-networks-rnn\/","title":{"rendered":"Deeplearning4j &#8211; Recurrent Neural Networks (RNN)"},"content":{"rendered":"<p>A recurrent neural network (RNN) is a class of artificial neural network where connections between nodes form a directed graph along a sequence. This allows it to exhibit temporal dynamic behavior for a time sequence. Unlike feedforward neural networks, RNNs can use their internal state (memory) to process sequences of inputs.<\/p>\n<p>I am showing a basic implementation of an RNN in DL4J. Further information can be found at<br \/>\n<a href=\"https:\/\/deeplearning4j.org\/docs\/latest\/deeplearning4j-nn-recurrent.\">https:\/\/deeplearning4j.org\/docs\/latest\/deeplearning4j-nn-recurrent.<\/a><br \/>\nThis demo has been implemented in Scala using Jupyter with the <a href=\"http:\/\/beakerx.com\/\">BeakerX<\/a> kernel.<\/p>\n<h3>Setup<\/h3>\n<p>We add the necessary dependencies to the classpath and import the classes which we subsequently plan to use:<br \/>\n&#8211; deeplearning4j-core<br \/>\n&#8211; nd4j (which is used for the underlying data model)<br \/>\n&#8211; the Deeplearning4j UI and Logback, which is needed by the UI<\/p>\n<pre><code class=\"Scala\">%%classpath add mvn \norg.nd4j:nd4j-native-platform:1.0.0-beta2\norg.deeplearning4j:deeplearning4j-core:1.0.0-beta2\norg.deeplearning4j:deeplearning4j-ui_2.11:1.0.0-beta2\nch.qos.logback:logback-classic:1.2.3\n\n<\/code><\/pre>\n<pre><code class=\"Scala\">import org.deeplearning4j.nn.conf.MultiLayerConfiguration;\nimport org.deeplearning4j.nn.conf.NeuralNetConfiguration;\nimport 
org.deeplearning4j.nn.conf.NeuralNetConfiguration.ListBuilder;\nimport org.deeplearning4j.nn.conf.layers.LSTM;\nimport org.deeplearning4j.nn.conf.layers.RnnOutputLayer;\nimport org.deeplearning4j.nn.conf.layers.recurrent.SimpleRnn;\nimport org.deeplearning4j.nn.multilayer.MultiLayerNetwork;\nimport org.deeplearning4j.nn.weights.WeightInit;\nimport org.deeplearning4j.optimize.listeners.ScoreIterationListener;\n\nimport org.deeplearning4j.ui.api.UIServer\nimport org.deeplearning4j.ui.storage.InMemoryStatsStorage\nimport org.deeplearning4j.ui.stats.StatsListener\n\nimport org.nd4j.linalg.activations.Activation;\nimport org.nd4j.linalg.api.ndarray.INDArray;\nimport org.nd4j.linalg.api.ops.impl.indexaccum.IMax;\nimport org.nd4j.linalg.dataset.DataSet;\nimport org.nd4j.linalg.factory.Nd4j;\nimport org.nd4j.linalg.learning.config.AMSGrad;\nimport org.nd4j.linalg.learning.config.RmsProp;\nimport org.nd4j.linalg.lossfunctions.LossFunctions.LossFunction;\n\nimport java.util.ArrayList;\nimport java.util.LinkedHashSet;\nimport java.util.List;\nimport java.util.Random;\nimport java.util.Arrays\n\n<\/code><\/pre>\n<pre><code>import org.deeplearning4j.nn.conf.MultiLayerConfiguration\nimport org.deeplearning4j.nn.conf.NeuralNetConfiguration\nimport org.deeplearning4j.nn.conf.NeuralNetConfiguration.ListBuilder\nimport org.deeplearning4j.nn.conf.layers.LSTM\nimport org.deeplearning4j.nn.conf.layers.RnnOutputLayer\nimport org.deeplearning4j.nn.conf.layers.recurrent.SimpleRnn\nimport org.deeplearning4j.nn.multilayer.MultiLayerNetwork\nimport org.deeplearning4j.nn.weights.WeightInit\nimport org.deeplearning4j.optimize.listeners.ScoreIterationListener\nimport org.deeplearning4j.ui.api.UIServer\nimport org.deeplearning4j.ui.storage.InMemoryStatsStorage\nimport org.deeplearning4j.ui.stats.StatsListener\nimport org.nd4j.linalg.activations.Activation\nimport org.nd4j.linalg.api.ndarray.INDArray\nimport org.nd4j.linalg.api.ops.impl.i...\n<\/code><\/pre>\n<h3>Data Model &#8211; Character 
Encoding<\/h3>\n<p>We define the sentence to learn as a String. A special marker character is added at the beginning so that the RNN learns the complete string and ends with the marker.<\/p>\n<p>We also create a dedicated List of the possible chars in the charList variable:<\/p>\n<pre><code class=\"Scala\">val learnString = \"*A computer will do what you tell it to do, but that may be much different from what you had in mind.\";\n\nval charList = learnString.toSet.toList\n\n<\/code><\/pre>\n<pre><code>[[e, *, n, ., y, t, u, f, A, a, m, i,  , ,, b, l, p, c, h, r, w, o, d]]\n<\/code><\/pre>\n<p>We use this character list to encode the characters in an ND4J array:<\/p>\n<pre><code class=\"Scala\">var nd4Array = Nd4j.zeros(1, charList.size, 1)\n\n<\/code><\/pre>\n<pre><code>[[0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0]]\n<\/code><\/pre>\n<p>We determine the index position of the character which we want to represent:<\/p>\n<pre><code class=\"Scala\">var pos:Int = charList.indexOf('A')\n\n<\/code><\/pre>\n<pre><code>8\n<\/code><\/pre>\n<p>And then we set the corresponding index position in the array to 1:<\/p>\n<pre><code class=\"Scala\">nd4Array.putScalar(pos, 1)\n<\/code><\/pre>\n<pre><code>[[0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  1.0000, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0, \n  0]]\n<\/code><\/pre>\n<h3>Definition of Neural Network<\/h3>\n<p>We define the network architecture: it consists of three LSTM (Long\/Short Term Memory) layers followed by an RnnOutputLayer.<\/p>\n<pre><code class=\"Scala\">import org.nd4j.linalg.learning.config._\n\n\/\/ RNN dimensions\nval HIDDEN_LAYER_WIDTH = 100;\n\n\/\/ some common parameters\nvar conf = new NeuralNetConfiguration.Builder()\n    .seed(123)\n    .biasInit(0)\n    .miniBatch(false)\n    .updater(new RmsProp(0.001))\n    .weightInit(WeightInit.XAVIER)\n    
.list()\n    .layer(0, new LSTM.Builder()\n           .nIn(charList.size)\n           .nOut(HIDDEN_LAYER_WIDTH)\n           .activation(Activation.TANH)\n           .build())\n    .layer(1, new LSTM.Builder()\n           .nIn(HIDDEN_LAYER_WIDTH)\n           .nOut(HIDDEN_LAYER_WIDTH)\n           .activation(Activation.TANH)\n           .build())\n    .layer(2, new LSTM.Builder()\n           .nIn(HIDDEN_LAYER_WIDTH)\n           .nOut(HIDDEN_LAYER_WIDTH)\n           .activation(Activation.TANH)\n           .build())\n    .layer(3, new RnnOutputLayer.Builder(LossFunction.MCXENT)\n        .activation(Activation.SOFTMAX)\n        .nIn(HIDDEN_LAYER_WIDTH)\n        .nOut(charList.size)\n        .build())\n    .pretrain(false)\n    .backprop(true)\n    .build()\n\n\/\/ create network\nvar net = new MultiLayerNetwork(conf)\n\nconf\n<\/code><\/pre>\n<pre><code>{\n  \"backprop\" : true,\n  \"backpropType\" : \"Standard\",\n  \"cacheMode\" : \"NONE\",\n  \"confs\" : [ {\n    \"cacheMode\" : \"NONE\",\n    \"epochCount\" : 0,\n    \"iterationCount\" : 0,\n    \"layer\" : {\n      \"@class\" : \"org.deeplearning4j.nn.conf.layers.LSTM\",\n      \"activationFn\" : {\n        \"@class\" : \"org.nd4j.linalg.activations.impl.ActivationTanH\"\n      },\n      \"biasInit\" : 0.0,\n      \"biasUpdater\" : null,\n      \"constraints\" : null,\n      \"dist\" : null,\n      \"distRecurrent\" : null,\n      \"forgetGateBiasInit\" : 1.0,\n      \"gateActivationFn\" : {\n        \"@class\" : \"org.nd4j.linalg.activations.impl.ActivationSigmoid\"\n      },\n      \"gradientNormalization\" : \"None\",\n      \"gradientNormalizationThreshold\" : 1.0,\n      \"idropout\" : null,\n      \"iupdater\" : {\n        \"@class\" : \"org.nd4j.linalg.learning.config.RmsProp\",\n        \"epsilon\" : 1.0E-8,\n        \"learningRate\" : 0.001,\n        \"rmsDecay\" : 0.95\n      },\n      \"l1\" : 0.0,\n      \"l1Bias\" : 0.0,\n      \"l2\" : 0.0,\n      \"l2Bias\" : 0.0,\n      \"layerName\" : 
\"layer0\",\n      \"nin\" : 23,\n      \"nout\" : 100,\n      \"pretrain\" : false,\n      \"weightInit\" : \"XAVIER\",\n      \"weightInitRecurrent\" : null,\n      \"weightNoise\" : null\n    },\n    \"maxNumLineSearchIterations\" : 5,\n    \"miniBatch\" : false,\n    \"minimize\" : true,\n    \"optimizationAlgo\" : \"STOCHASTIC_GRADIENT_DESCENT\",\n    \"pretrain\" : false,\n    \"seed\" : 123,\n    \"stepFunction\" : null,\n    \"variables\" : [ ]\n  }, {\n    \"cacheMode\" : \"NONE\",\n    \"epochCount\" : 0,\n    \"iterationCount\" : 0,\n    \"layer\" : {\n      \"@class\" : \"org.deeplearning4j.nn.conf.layers.LSTM\",\n      \"activationFn\" : {\n        \"@class\" : \"org.nd4j.linalg.activations.impl.ActivationTanH\"\n      },\n      \"biasInit\" : 0.0,\n      \"biasUpdater\" : null,\n      \"constraints\" : null,\n      \"dist\" : null,\n      \"distRecurrent\" : null,\n      \"forgetGateBiasInit\" : 1.0,\n      \"gateActivationFn\" : {\n        \"@class\" : \"org.nd4j.linalg.activations.impl.ActivationSigmoid\"\n      },\n      \"gradientNormalization\" : \"None\",\n      \"gradientNormalizationThreshold\" : 1.0,\n      \"idropout\" : null,\n      \"iupdater\" : {\n        \"@class\" : \"org.nd4j.linalg.learning.config.RmsProp\",\n        \"epsilon\" : 1.0E-8,\n        \"learningRate\" : 0.001,\n        \"rmsDecay\" : 0.95\n      },\n      \"l1\" : 0.0,\n      \"l1Bias\" : 0.0,\n      \"l2\" : 0.0,\n      \"l2Bias\" : 0.0,\n      \"layerName\" : \"layer1\",\n      \"nin\" : 100,\n      \"nout\" : 100,\n      \"pretrain\" : false,\n      \"weightInit\" : \"XAVIER\",\n      \"weightInitRecurrent\" : null,\n      \"weightNoise\" : null\n    },\n    \"maxNumLineSearchIterations\" : 5,\n    \"miniBatch\" : false,\n    \"minimize\" : true,\n    \"optimizationAlgo\" : \"STOCHASTIC_GRADIENT_DESCENT\",\n    \"pretrain\" : false,\n    \"seed\" : 123,\n    \"stepFunction\" : null,\n    \"variables\" : [ ]\n  }, {\n    \"cacheMode\" : \"NONE\",\n    \"epochCount\" : 
0,\n    \"iterationCount\" : 0,\n    \"layer\" : {\n      \"@class\" : \"org.deeplearning4j.nn.conf.layers.LSTM\",\n      \"activationFn\" : {\n        \"@class\" : \"org.nd4j.linalg.activations.impl.ActivationTanH\"\n      },\n      \"biasInit\" : 0.0,\n      \"biasUpdater\" : null,\n      \"constraints\" : null,\n      \"dist\" : null,\n      \"distRecurrent\" : null,\n      \"forgetGateBiasInit\" : 1.0,\n      \"gateActivationFn\" : {\n        \"@class\" : \"org.nd4j.linalg.activations.impl.ActivationSigmoid\"\n      },\n      \"gradientNormalization\" : \"None\",\n      \"gradientNormalizationThreshold\" : 1.0,\n      \"idropout\" : null,\n      \"iupdater\" : {\n        \"@class\" : \"org.nd4j.linalg.learning.config.RmsProp\",\n        \"epsilon\" : 1.0E-8,\n        \"learningRate\" : 0.001,\n        \"rmsDecay\" : 0.95\n      },\n      \"l1\" : 0.0,\n      \"l1Bias\" : 0.0,\n      \"l2\" : 0.0,\n      \"l2Bias\" : 0.0,\n      \"layerName\" : \"layer2\",\n      \"nin\" : 100,\n      \"nout\" : 100,\n      \"pretrain\" : false,\n      \"weightInit\" : \"XAVIER\",\n      \"weightInitRecurrent\" : null,\n      \"weightNoise\" : null\n    },\n    \"maxNumLineSearchIterations\" : 5,\n    \"miniBatch\" : false,\n    \"minimize\" : true,\n    \"optimizationAlgo\" : \"STOCHASTIC_GRADIENT_DESCENT\",\n    \"pretrain\" : false,\n    \"seed\" : 123,\n    \"stepFunction\" : null,\n    \"variables\" : [ ]\n  }, {\n    \"cacheMode\" : \"NONE\",\n    \"epochCount\" : 0,\n    \"iterationCount\" : 0,\n    \"layer\" : {\n      \"@class\" : \"org.deeplearning4j.nn.conf.layers.RnnOutputLayer\",\n      \"activationFn\" : {\n        \"@class\" : \"org.nd4j.linalg.activations.impl.ActivationSoftmax\"\n      },\n      \"biasInit\" : 0.0,\n      \"biasUpdater\" : null,\n      \"constraints\" : null,\n      \"dist\" : null,\n      \"gradientNormalization\" : \"None\",\n      \"gradientNormalizationThreshold\" : 1.0,\n      \"hasBias\" : true,\n      \"idropout\" : null,\n      
\"iupdater\" : {\n        \"@class\" : \"org.nd4j.linalg.learning.config.RmsProp\",\n        \"epsilon\" : 1.0E-8,\n        \"learningRate\" : 0.001,\n        \"rmsDecay\" : 0.95\n      },\n      \"l1\" : 0.0,\n      \"l1Bias\" : 0.0,\n      \"l2\" : 0.0,\n      \"l2Bias\" : 0.0,\n      \"layerName\" : \"layer3\",\n      \"lossFn\" : {\n        \"@class\" : \"org.nd4j.linalg.lossfunctions.impl.LossMCXENT\",\n        \"softmaxClipEps\" : 1.0E-10,\n        \"configProperties\" : false,\n        \"numOutputs\" : -1\n      },\n      \"nin\" : 100,\n      \"nout\" : 23,\n      \"pretrain\" : false,\n      \"weightInit\" : \"XAVIER\",\n      \"weightNoise\" : null\n    },\n    \"maxNumLineSearchIterations\" : 5,\n    \"miniBatch\" : false,\n    \"minimize\" : true,\n    \"optimizationAlgo\" : \"STOCHASTIC_GRADIENT_DESCENT\",\n    \"pretrain\" : false,\n    \"seed\" : 123,\n    \"stepFunction\" : null,\n    \"variables\" : [ ]\n  } ],\n  \"epochCount\" : 0,\n  \"inferenceWorkspaceMode\" : \"ENABLED\",\n  \"inputPreProcessors\" : { },\n  \"iterationCount\" : 0,\n  \"pretrain\" : false,\n  \"tbpttBackLength\" : 20,\n  \"tbpttFwdLength\" : 20,\n  \"trainingWorkspaceMode\" : \"ENABLED\"\n}\n<\/code><\/pre>\n<h3>Training Data<\/h3>\n<p>We generate the training data as an array of all encoded input characters together with the corresponding encoded outputs, which are simply the subsequent characters. E.g. for the input &#8216;A&#8217; we use the output (label) &#8216; &#8217;.<\/p>\n<pre><code class=\"Scala\">\/\/ create input and output arrays: SAMPLE_INDEX, INPUT_NEURON,\n\/\/ learnString\nvar input = Nd4j.zeros(1, charList.size, learnString.size);\nvar labels = Nd4j.zeros(1, charList.size, learnString.size);\n\/\/ loop through our sample-sentence\n\nfor (samplePos &lt;- 0 to learnString.size - 1) {\n    \/\/ small hack: when currentChar is the last, take the first char as\n    \/\/ nextChar - not really required. 
We compensate for this by having added the marker character at the beginning.\n    var currentChar = learnString(samplePos);\n    \/\/ On the last character we point back to the first character as the next position\n    var nextChar = learnString((samplePos + 1) % (learnString.length));\n    \/\/ input neuron for current-char is 1 at \"samplePos\"\n    input.putScalar(Array[Int](0, charList.indexOf(currentChar), samplePos ), 1);\n    \/\/ output neuron for next-char is 1 at \"samplePos\"\n    labels.putScalar(Array[Int](0, charList.indexOf(nextChar), samplePos ), 1);\n}\n\nvar trainingData = new DataSet(input, labels);\n\n<\/code><\/pre>\n<pre><code>===========INPUT===================\n[[[         0,         0,         0  ...         0         0,         0], \n  [    1.0000,         0,         0  ...         0         0,         0], \n  [         0,         0,         0  ...    1.0000         0,         0], \n   ..., \n  [         0,         0,         0  ...         0         0,         0], \n  [         0,         0,         0  ...         0         0,         0], \n  [         0,         0,         0  ...         0    1.0000,         0]]]\n=================OUTPUT==================\n[[[         0,         0,         0  ...         0         0,         0], \n  [         0,         0,         0  ...         0         0,    1.0000], \n  [         0,         0,         0  ...         0         0,         0], \n   ..., \n  [         0,         0,         0  ...         0         0,         0], \n  [         0,         0,         0  ...         0         0,         0], \n  [         0,         0,         0  ...    
1.0000         0,         0]]]\n<\/code><\/pre>\n<p>Please note that both the features and the labels are stored in a 3-dimensional tensor with the following sizes:<\/p>\n<pre><code class=\"Scala\">println(\"Size of Dimension 0 (time series): \"+trainingData.getFeatures().size(0))\nprintln(\"Size of Dimension 1 (values per time step): \"+trainingData.getFeatures().size(1))\nprintln(\"Size of Dimension 2 (time steps): \"+trainingData.getFeatures().size(2))\n<\/code><\/pre>\n<pre><code>Size of Dimension 0 (time series): 1\nSize of Dimension 1 (values per time step): 23\nSize of Dimension 2 (time steps): 101\n\nnull\n<\/code><\/pre>\n<h3>Training UI<\/h3>\n<p>So that we can follow the learning in the GUI we do the following:<br \/>\n&#8211; we set up the UIServer<br \/>\n&#8211; and show the UIServer in an IFrame<br \/>\n&#8211; we define the StatsListener to update the UI<br \/>\n&#8211; and finally start the learning<\/p>\n<pre><code class=\"Scala\">\/\/Initialize the user interface backend\nvar uiServer = UIServer.getInstance();\n\n\/\/Configure where the network information (gradients, score vs. time etc) is to be stored. Here: store in memory.\nvar statsStorage = new InMemoryStatsStorage();         \/\/Alternative: new FileStatsStorage(File), for saving and loading later\n\n\/\/Attach the StatsStorage instance to the UI: this allows the contents of the StatsStorage to be visualized\nuiServer.attach(statsStorage)\n\n\"The server is available at http:\/\"+uiServer.getAddress()\n<\/code><\/pre>\n<pre><code>The server is available at http:\/\/0.0.0.0:9000\n<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/www.pschatzmann.ch\/wp-content\/uploads\/2018\/09\/TrainingUI.png\" \/><\/p>\n<p>The score is the output of the loss function. The goal of the learning is to minimize the loss: so a smaller number is better than a bigger one. 
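<\/p>\n<p>If you prefer not to run the UI, the score can also simply be printed to the console with the ScoreIterationListener which we imported above. This is a minimal alternative sketch, not part of the original notebook:<\/p>\n<pre><code class=\"Scala\">\/\/ print the current score to the console every 100 iterations\n\/\/ (a console-only alternative to the StatsListener \/ UI approach)\nnet.setListeners(new ScoreIterationListener(100))\n<\/code><\/pre>\n<p>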
We are looking for a situation where the Model Score vs. Iteration chart shows a steadily decreasing line.<\/p>\n<pre><code class=\"Scala\">net.init();\n\/\/ add the StatsListener to collect this information to display in the UI\nnet.setListeners(new StatsListener(statsStorage))\n\nval epochs = 1000\nfor (epoch &lt;- 0 to epochs) {\n    \/\/ train the data\n    net.fit(trainingData);\n}\n\n\"Training done!\"\n<\/code><\/pre>\n<pre><code>Training done!\n<\/code><\/pre>\n<h3>Displaying the Result<\/h3>\n<p>The result can be determined with the help of the rnnTimeStep method. For the first call we pass the encoded start character * and get a matrix as a result.<\/p>\n<pre><code class=\"Scala\">\/\/ clear the current state from the last example\nnet.rnnClearPreviousState();\n\n\/\/ put the first character into the RNN as an initialisation\nvar testInit = Nd4j.zeros(1, charList.size, 1)\nvar pos:Int = charList.indexOf(learnString(0))\ntestInit.putScalar(pos, 1)\n\n\/\/ run one step -&gt; IMPORTANT: rnnTimeStep() must be called, not\n\/\/ output(). The output shows what the net thinks should come next\nvar output = net.rnnTimeStep(testInit);\n\n\n<\/code><\/pre>\n<pre><code>[[0.0008, \n  0.0013, \n  0.0007, \n  0.0005, \n  0.0012, \n  0.0029, \n  0.0010, \n  0.0007, \n  0.8118, \n  0.0011, \n  0.0022, \n  0.0007, \n  0.1530, \n  0.0009, \n  0.0007, \n  0.0013, \n  0.0004, \n  0.0122, \n  0.0012, \n  0.0007, \n  0.0015, \n  0.0023, \n  0.0010]]\n<\/code><\/pre>\n<p>The resulting character can be determined from the maximum value in the array. Here it is the 9th position with the value of 0.8118. 
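<\/p>\n<p>This argmax lookup can be captured in a small helper. The following is an illustrative sketch, not part of the original notebook; it uses the same IMax operation that appears in the sampling loop below:<\/p>\n<pre><code class=\"Scala\">\/\/ map an output distribution of shape (1, charList.size, 1)\n\/\/ back to the most probable character\ndef decode(distribution: INDArray): Char = {\n    val idx = Nd4j.getExecutioner().exec(new IMax(distribution), 1).getInt(0)\n    charList(idx)\n}\n<\/code><\/pre>\n<p>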
If we check our charList, the 9th position indeed contains the character &#8216;A&#8217;!<\/p>\n<p>The result matrix is then the input for the next call&#8230;<\/p>\n<pre><code class=\"Scala\">\/\/ now the net should guess learnString.length more characters\nvar result = \"\"\nfor (char &lt;- learnString) {\n    \/\/ first map the last output of the network to a concrete\n    \/\/ neuron: the neuron with the highest output has the highest\n    \/\/ chance to get chosen\n    var sampledCharacterIdx = Nd4j.getExecutioner().exec(new IMax(output), 1).getInt(0);\n\n    \/\/ concatenate the chosen output\n    result += charList(sampledCharacterIdx);\n\n    \/\/ use the last output as input\n    var nextInput = Nd4j.zeros(1, charList.size, 1);\n    nextInput.putScalar(sampledCharacterIdx, 1);\n    output = net.rnnTimeStep(nextInput);\n}\n\nresult\n<\/code><\/pre>\n<pre><code>A computer will do what you tell it to do, but that may be much different from what you had in mind.*\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>A recurrent neural network (RNN) is a class of artificial neural network where connections between nodes form a directed graph along a sequence. This allows it to exhibit temporal dynamic behavior for a time sequence. 
Unlike feedforward neural networks, RNNs can use their internal state (memory) to process sequences of [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":567,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_crdt_document":"","_import_markdown_pro_load_document_selector":0,"_import_markdown_pro_submit_text_textarea":"","_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[14],"tags":[],"class_list":["post-565","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-machine-learning"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.5 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Deeplearning4j - Recurrent Neural Networks (RNN) - Phil Schatzmann<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.pschatzmann.ch\/home\/2018\/09\/24\/deeplearningforj-recurrent-neural-networks-rnn\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Deeplearning4j - Recurrent Neural Networks (RNN) - Phil Schatzmann\" \/>\n<meta property=\"og:description\" content=\"A recurrent neural network (RNN) is a class of artificial neural network where connections between nodes form a directed graph along a sequence. This allows it to exhibit temporal dynamic behavior for a time sequence. 
Unlike feedforward neural networks, RNNs can use their internal state (memory) to process sequences of [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.pschatzmann.ch\/home\/2018\/09\/24\/deeplearningforj-recurrent-neural-networks-rnn\/\" \/>\n<meta property=\"og:site_name\" content=\"Phil Schatzmann\" \/>\n<meta property=\"article:published_time\" content=\"2018-09-24T18:24:14+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2020-11-21T21:22:51+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.pschatzmann.ch\/wp-content\/uploads\/2018\/09\/score.png\" \/>\n\t<meta property=\"og:image:width\" content=\"544\" \/>\n\t<meta property=\"og:image:height\" content=\"389\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"pschatzmann\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"pschatzmann\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"9 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/2018\\\/09\\\/24\\\/deeplearningforj-recurrent-neural-networks-rnn\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/2018\\\/09\\\/24\\\/deeplearningforj-recurrent-neural-networks-rnn\\\/\"},\"author\":{\"name\":\"pschatzmann\",\"@id\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/#\\\/schema\\\/person\\\/73a53638a4e34e8373405fd737dac9b1\"},\"headline\":\"Deeplearning4j &#8211; Recurrent Neural Networks (RNN)\",\"datePublished\":\"2018-09-24T18:24:14+00:00\",\"dateModified\":\"2020-11-21T21:22:51+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/2018\\\/09\\\/24\\\/deeplearningforj-recurrent-neural-networks-rnn\\\/\"},\"wordCount\":474,\"commentCount\":1,\"publisher\":{\"@id\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/#\\\/schema\\\/person\\\/73a53638a4e34e8373405fd737dac9b1\"},\"image\":{\"@id\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/2018\\\/09\\\/24\\\/deeplearningforj-recurrent-neural-networks-rnn\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.pschatzmann.ch\\\/wp-content\\\/uploads\\\/2018\\\/09\\\/score.png\",\"articleSection\":[\"Machine Learning\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/2018\\\/09\\\/24\\\/deeplearningforj-recurrent-neural-networks-rnn\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/2018\\\/09\\\/24\\\/deeplearningforj-recurrent-neural-networks-rnn\\\/\",\"url\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/2018\\\/09\\\/24\\\/deeplearningforj-recurrent-neural-networks-rnn\\\/\",\"name\":\"Deeplearning4j - Recurrent Neural 
Networks (RNN) - Phil Schatzmann\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/2018\\\/09\\\/24\\\/deeplearningforj-recurrent-neural-networks-rnn\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/2018\\\/09\\\/24\\\/deeplearningforj-recurrent-neural-networks-rnn\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.pschatzmann.ch\\\/wp-content\\\/uploads\\\/2018\\\/09\\\/score.png\",\"datePublished\":\"2018-09-24T18:24:14+00:00\",\"dateModified\":\"2020-11-21T21:22:51+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/2018\\\/09\\\/24\\\/deeplearningforj-recurrent-neural-networks-rnn\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/2018\\\/09\\\/24\\\/deeplearningforj-recurrent-neural-networks-rnn\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/2018\\\/09\\\/24\\\/deeplearningforj-recurrent-neural-networks-rnn\\\/#primaryimage\",\"url\":\"https:\\\/\\\/www.pschatzmann.ch\\\/wp-content\\\/uploads\\\/2018\\\/09\\\/score.png\",\"contentUrl\":\"https:\\\/\\\/www.pschatzmann.ch\\\/wp-content\\\/uploads\\\/2018\\\/09\\\/score.png\",\"width\":544,\"height\":389},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/2018\\\/09\\\/24\\\/deeplearningforj-recurrent-neural-networks-rnn\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Deeplearning4j &#8211; Recurrent Neural Networks (RNN)\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/#website\",\"url\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/\",\"name\":\"Phil 
Schatzmann Consulting\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/#\\\/schema\\\/person\\\/73a53638a4e34e8373405fd737dac9b1\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/www.pschatzmann.ch\\\/home\\\/#\\\/schema\\\/person\\\/73a53638a4e34e8373405fd737dac9b1\",\"name\":\"pschatzmann\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.pschatzmann.ch\\\/wp-content\\\/uploads\\\/2022\\\/08\\\/pschatzmann.png\",\"url\":\"https:\\\/\\\/www.pschatzmann.ch\\\/wp-content\\\/uploads\\\/2022\\\/08\\\/pschatzmann.png\",\"contentUrl\":\"https:\\\/\\\/www.pschatzmann.ch\\\/wp-content\\\/uploads\\\/2022\\\/08\\\/pschatzmann.png\",\"width\":305,\"height\":305,\"caption\":\"pschatzmann\"},\"logo\":{\"@id\":\"https:\\\/\\\/www.pschatzmann.ch\\\/wp-content\\\/uploads\\\/2022\\\/08\\\/pschatzmann.png\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Deeplearning4j - Recurrent Neural Networks (RNN) - Phil Schatzmann","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.pschatzmann.ch\/home\/2018\/09\/24\/deeplearningforj-recurrent-neural-networks-rnn\/","og_locale":"en_US","og_type":"article","og_title":"Deeplearning4j - Recurrent Neural Networks (RNN) - Phil Schatzmann","og_description":"A recurrent neural network (RNN) is a class of artificial neural network where connections between nodes form a directed graph along a sequence. 
This allows it to exhibit temporal dynamic behavior for a time sequence. Unlike feedforward neural networks, RNNs can use their internal state (memory) to process sequences of [&hellip;]","og_url":"https:\/\/www.pschatzmann.ch\/home\/2018\/09\/24\/deeplearningforj-recurrent-neural-networks-rnn\/","og_site_name":"Phil Schatzmann","article_published_time":"2018-09-24T18:24:14+00:00","article_modified_time":"2020-11-21T21:22:51+00:00","og_image":[{"width":544,"height":389,"url":"https:\/\/www.pschatzmann.ch\/wp-content\/uploads\/2018\/09\/score.png","type":"image\/png"}],"author":"pschatzmann","twitter_card":"summary_large_image","twitter_misc":{"Written by":"pschatzmann","Est. reading time":"9 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.pschatzmann.ch\/home\/2018\/09\/24\/deeplearningforj-recurrent-neural-networks-rnn\/#article","isPartOf":{"@id":"https:\/\/www.pschatzmann.ch\/home\/2018\/09\/24\/deeplearningforj-recurrent-neural-networks-rnn\/"},"author":{"name":"pschatzmann","@id":"https:\/\/www.pschatzmann.ch\/home\/#\/schema\/person\/73a53638a4e34e8373405fd737dac9b1"},"headline":"Deeplearning4j &#8211; Recurrent Neural Networks (RNN)","datePublished":"2018-09-24T18:24:14+00:00","dateModified":"2020-11-21T21:22:51+00:00","mainEntityOfPage":{"@id":"https:\/\/www.pschatzmann.ch\/home\/2018\/09\/24\/deeplearningforj-recurrent-neural-networks-rnn\/"},"wordCount":474,"commentCount":1,"publisher":{"@id":"https:\/\/www.pschatzmann.ch\/home\/#\/schema\/person\/73a53638a4e34e8373405fd737dac9b1"},"image":{"@id":"https:\/\/www.pschatzmann.ch\/home\/2018\/09\/24\/deeplearningforj-recurrent-neural-networks-rnn\/#primaryimage"},"thumbnailUrl":"https:\/\/www.pschatzmann.ch\/wp-content\/uploads\/2018\/09\/score.png","articleSection":["Machine 
Learning"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.pschatzmann.ch\/home\/2018\/09\/24\/deeplearningforj-recurrent-neural-networks-rnn\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.pschatzmann.ch\/home\/2018\/09\/24\/deeplearningforj-recurrent-neural-networks-rnn\/","url":"https:\/\/www.pschatzmann.ch\/home\/2018\/09\/24\/deeplearningforj-recurrent-neural-networks-rnn\/","name":"Deeplearning4j - Recurrent Neural Networks (RNN) - Phil Schatzmann","isPartOf":{"@id":"https:\/\/www.pschatzmann.ch\/home\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.pschatzmann.ch\/home\/2018\/09\/24\/deeplearningforj-recurrent-neural-networks-rnn\/#primaryimage"},"image":{"@id":"https:\/\/www.pschatzmann.ch\/home\/2018\/09\/24\/deeplearningforj-recurrent-neural-networks-rnn\/#primaryimage"},"thumbnailUrl":"https:\/\/www.pschatzmann.ch\/wp-content\/uploads\/2018\/09\/score.png","datePublished":"2018-09-24T18:24:14+00:00","dateModified":"2020-11-21T21:22:51+00:00","breadcrumb":{"@id":"https:\/\/www.pschatzmann.ch\/home\/2018\/09\/24\/deeplearningforj-recurrent-neural-networks-rnn\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.pschatzmann.ch\/home\/2018\/09\/24\/deeplearningforj-recurrent-neural-networks-rnn\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.pschatzmann.ch\/home\/2018\/09\/24\/deeplearningforj-recurrent-neural-networks-rnn\/#primaryimage","url":"https:\/\/www.pschatzmann.ch\/wp-content\/uploads\/2018\/09\/score.png","contentUrl":"https:\/\/www.pschatzmann.ch\/wp-content\/uploads\/2018\/09\/score.png","width":544,"height":389},{"@type":"BreadcrumbList","@id":"https:\/\/www.pschatzmann.ch\/home\/2018\/09\/24\/deeplearningforj-recurrent-neural-networks-rnn\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.pschatzmann.ch\/home\/"},{"@type":"ListItem","position":2,"name"
:"Deeplearning4j &#8211; Recurrent Neural Networks (RNN)"}]},{"@type":"WebSite","@id":"https:\/\/www.pschatzmann.ch\/home\/#website","url":"https:\/\/www.pschatzmann.ch\/home\/","name":"Phil Schatzmann Consulting","description":"","publisher":{"@id":"https:\/\/www.pschatzmann.ch\/home\/#\/schema\/person\/73a53638a4e34e8373405fd737dac9b1"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.pschatzmann.ch\/home\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/www.pschatzmann.ch\/home\/#\/schema\/person\/73a53638a4e34e8373405fd737dac9b1","name":"pschatzmann","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.pschatzmann.ch\/wp-content\/uploads\/2022\/08\/pschatzmann.png","url":"https:\/\/www.pschatzmann.ch\/wp-content\/uploads\/2022\/08\/pschatzmann.png","contentUrl":"https:\/\/www.pschatzmann.ch\/wp-content\/uploads\/2022\/08\/pschatzmann.png","width":305,"height":305,"caption":"pschatzmann"},"logo":{"@id":"https:\/\/www.pschatzmann.ch\/wp-content\/uploads\/2022\/08\/pschatzmann.png"}}]}},"post_mailing_queue_ids":[],"_links":{"self":[{"href":"https:\/\/www.pschatzmann.ch\/home\/wp-json\/wp\/v2\/posts\/565","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.pschatzmann.ch\/home\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.pschatzmann.ch\/home\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.pschatzmann.ch\/home\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.pschatzmann.ch\/home\/wp-json\/wp\/v2\/comments?post=565"}],"version-history":[{"count":1,"href":"https:\/\/www.pschatzmann.ch\/home\/wp-json\/wp\/v2\/posts\/565\/revisions"}],"predecessor-version":[{"id":2221,"href":"https:\/\/www.pschatzmann.ch\/home\/wp-json\/wp\/v2\/posts\/565\/r
evisions\/2221"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.pschatzmann.ch\/home\/wp-json\/wp\/v2\/media\/567"}],"wp:attachment":[{"href":"https:\/\/www.pschatzmann.ch\/home\/wp-json\/wp\/v2\/media?parent=565"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.pschatzmann.ch\/home\/wp-json\/wp\/v2\/categories?post=565"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.pschatzmann.ch\/home\/wp-json\/wp\/v2\/tags?post=565"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}