Code Generation Using an LSTM (Long Short-Term Memory) RNN

Meena Vyas
Nov 2, 2018

A recurrent neural network (RNN) is a class of neural network that performs well when the input/output is a sequence. RNNs can use their internal state/memory to process sequences of inputs.

Neural network models come in several kinds, depending on the shape of their input and output:

  • One to one: Image classification, where we give an input image and the model returns the class the image belongs to.
  • One to many: Image captioning, where the input is a picture and the output is a sentence describing the picture.
  • Many to one: Sentiment analysis, where the input is a tweet and the output is a class such as positive or negative.
  • Many to many: Sequence-to-sequence models with an encoder-decoder architecture, such as a language translation model where the input is a sentence (let’s say in English) and the output is a sentence in another language (let’s say French).

There are two popular variants of RNNs: LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit).

We should try both to see which one performs better for the problem we are trying to solve.

In this post I try to generate new source code using an LSTM. Here are the steps.

Import the required packages.

Then set the number of epochs (EPOCH) and the batch size. Both should be tuned for the problem.
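As a rough sketch of what these two settings control: the number of epochs is how many passes are made over the training data, and the batch size is how many samples go into each weight update. `EPOCHS = 10` matches the run described in this post; `BATCH_SIZE = 128` and the helper `updates_per_epoch` are assumptions made for illustration, not values taken from the original code.

```python
import math

EPOCHS = 10        # matches the 10-epoch run reported in this post
BATCH_SIZE = 128   # assumed placeholder; tune it for your memory budget

def updates_per_epoch(num_samples, batch_size=BATCH_SIZE):
    """Number of weight updates in one pass over the training data."""
    return math.ceil(num_samples / batch_size)

# Hypothetical data set of 100,000 training sequences.
print(updates_per_epoch(100_000))           # batches in one epoch
print(updates_per_epoch(100_000) * EPOCHS)  # batches over the whole run
```

A larger batch size means fewer updates per epoch but more memory per update, which is one reason the two values are usually tuned together.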

In the preprocessing stage, I downloaded the OpenSSL source code from GitHub and concatenated all .c files into a file called “train.txt”. I was running out of memory, so I used only one third of the OpenSSL files. The code could be improved to load the source in batches. The preprocessing stage also builds a vocabulary list, saves it to a file, and reads it back from that file.
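The preprocessing described above could look roughly like the following. This is a minimal sketch, not the original code: the function names, the `vocab.json` file name, and the choice of keeping every third file (one way to get about one third of the sources) are all assumptions.

```python
import json
from pathlib import Path

def build_corpus(src_dir, out_file="train.txt", keep_every=3):
    """Concatenate every `keep_every`-th .c file under src_dir into out_file."""
    files = sorted(Path(src_dir).rglob("*.c"))[::keep_every]
    with open(out_file, "w", encoding="utf-8") as out:
        for path in files:
            out.write(path.read_text(encoding="utf-8", errors="ignore"))
            out.write("\n")
    return len(files)

def save_vocab(corpus_file, vocab_file="vocab.json"):
    """Write the sorted list of distinct characters in the corpus to vocab_file."""
    text = Path(corpus_file).read_text(encoding="utf-8")
    vocab = sorted(set(text))
    Path(vocab_file).write_text(json.dumps(vocab), encoding="utf-8")
    return vocab

def load_vocab(vocab_file="vocab.json"):
    """Read the vocabulary back and build both lookup tables."""
    vocab = json.loads(Path(vocab_file).read_text(encoding="utf-8"))
    char_to_idx = {ch: i for i, ch in enumerate(vocab)}
    idx_to_char = {i: ch for ch, i in char_to_idx.items()}
    return char_to_idx, idx_to_char
```

Saving the vocabulary to a file keeps the character-to-index mapping stable between the training run and any later generation run.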

I have used a character-based model. We could build a word-based model instead. We could also add a word embedding layer, which will be needed for more difficult problem sets.
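To make "character-based" concrete, here is a small sketch of how text can be turned into training pairs: a fixed-length window of characters is the input, and the character that follows it is the target. The window length, the step, and the function names are assumptions chosen for illustration; in practice the encoded windows would usually be stored as NumPy arrays.

```python
def make_training_pairs(text, char_to_idx, seq_len=40, step=3):
    """Slide a window over text; each window predicts the character after it."""
    inputs, targets = [], []
    for start in range(0, len(text) - seq_len, step):
        window = text[start:start + seq_len]
        inputs.append([char_to_idx[ch] for ch in window])
        targets.append(char_to_idx[text[start + seq_len]])
    return inputs, targets

def one_hot(index, vocab_size):
    """Return a list of zeros with a single 1 at position index."""
    vec = [0] * vocab_size
    vec[index] = 1
    return vec

# Tiny example: the vocabulary is just the characters of one line of C.
text = "int main() { return 0; }"
char_to_idx = {ch: i for i, ch in enumerate(sorted(set(text)))}
inputs, targets = make_training_pairs(text, char_to_idx, seq_len=5, step=1)
first_input = [one_hot(i, len(char_to_idx)) for i in inputs[0]]
```

A word-based model would follow the same pattern with tokens in place of characters, at the cost of a much larger vocabulary.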

I have used two LSTM layers with a Dropout of 0.2 each, followed by a final Dense layer with softmax activation. We can try different models and compare.
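A minimal Keras sketch of an architecture like this follows. The two LSTM layers, the 0.2 dropout, and the softmax Dense layer come from the description above. The number of units, the loss, the optimizer, the one-hot input shape, and the reading of "Dropout of 0.2 each" as a separate Dropout layer after each LSTM are all assumptions.

```python
from tensorflow.keras.layers import Dense, Dropout, Input, LSTM
from tensorflow.keras.models import Sequential

def build_model(seq_len, vocab_size, units=128):
    """Two stacked LSTM layers, each followed by dropout, then a softmax over characters."""
    model = Sequential([
        Input(shape=(seq_len, vocab_size)),   # one-hot encoded character windows
        LSTM(units, return_sequences=True),   # pass the full sequence to the next LSTM
        Dropout(0.2),
        LSTM(units),                          # keep only the final state
        Dropout(0.2),
        Dense(vocab_size, activation="softmax"),
    ])
    # Assumed training setup; the original post does not state the loss or optimizer.
    model.compile(loss="categorical_crossentropy", optimizer="adam")
    return model

model = build_model(seq_len=40, vocab_size=96)
model.summary()
```

The first LSTM needs `return_sequences=True` so that the second LSTM receives one vector per time step rather than only the last one.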

Then visualize the model.

I trained the model for 10 epochs. The loss came down gradually with every epoch, from 2.97 to 1.55.
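One way to read these loss values, assuming the reported loss is categorical cross-entropy measured with the natural logarithm (the usual pairing with a softmax output, but an assumption here): exp(-loss) is the geometric-mean probability the model assigns to the correct next character. The helper name below is hypothetical.

```python
import math

def mean_correct_char_probability(loss):
    """Geometric-mean probability of the correct character for a cross-entropy loss in nats."""
    return math.exp(-loss)

for loss in (2.97, 1.55):
    print(f"loss {loss:.2f} -> probability {mean_correct_char_probability(loss):.3f}")
```

Under that assumption, the drop from 2.97 to 1.55 corresponds to the model going from roughly 5% to roughly 21% probability on the correct next character.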

Here is the output it generated, given a random starting point.
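Generation from a random starting point can be sketched as a loop: pick a random window from the encoded training text, ask the model for the next-character distribution, sample a character, slide the window forward, and repeat. In this sketch `predict_proba` is a hypothetical stand-in for the trained model's prediction call, and sampling from the distribution (rather than always taking the most likely character) is an assumption.

```python
import random

def random_seed_ids(encoded_corpus, seq_len, rng=None):
    """Pick a random window of seq_len character indices from the corpus."""
    rng = rng or random.Random()
    start = rng.randrange(len(encoded_corpus) - seq_len)
    return encoded_corpus[start:start + seq_len]

def generate(predict_proba, seed_ids, idx_to_char, length=200, rng=None):
    """Repeatedly sample the next character and slide the input window forward."""
    rng = rng or random.Random()
    window = list(seed_ids)
    out = []
    for _ in range(length):
        probs = predict_proba(window)  # one probability per vocabulary entry
        next_id = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        out.append(idx_to_char[next_id])
        window = window[1:] + [next_id]
    return "".join(out)

# Stub model for demonstration: always predicts the "next" index cyclically.
idx_to_char = {0: "a", 1: "b", 2: "c"}
stub = lambda window: [1.0 if i == (window[-1] + 1) % 3 else 0.0 for i in range(3)]
print(generate(stub, [0, 1, 2], idx_to_char, length=6))
```

With a real model, `predict_proba` would one-hot encode the window, call the model, and return the softmax output for that single input.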

As you can see, it has done a very good job: it returns values from a function based on an if condition and then starts another function.

Here is the code on GitHub. Please try it out and see.

Originally published at meenavyas.wordpress.com on November 2, 2018.
