chnsh/deep-semantic-code-search

Deep Semantic Code Search

Code for Paper: Paper

Deep Semantic Code Search aims to explore a joint embedding space for code and description vectors and then use it for a code search application.
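As a rough illustration of how a joint embedding space can drive code search, the sketch below ranks code snippets by cosine similarity to a query vector. It is a minimal sketch, not this repository's implementation: the function names `cosine_similarity` and `search`, the snippet names, and the three-dimensional vectors are all hypothetical, and in practice the vectors would come from trained code and description encoders.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, code_vecs, top_k=3):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in code_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings; real ones would be produced by the encoders.
code_vecs = {
    "read_file": [0.9, 0.1, 0.0],
    "sort_list": [0.0, 0.8, 0.6],
    "http_get": [0.1, 0.0, 0.95],
}
query_vec = [1.0, 0.2, 0.0]  # stand-in for an embedded natural-language query

results = search(query_vec, code_vecs, top_k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

Because the query and the code snippets live in the same vector space, search reduces to a nearest-neighbour lookup; for large corpora an approximate nearest-neighbour index would normally replace the linear scan used here.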

The experiments have two parts:

  1. The first uses the approach proposed in [1]; we train that architecture on our own Python dataset.
  2. The second extends the first using the methodology suggested in [2], with reasonably good results.

The results suggest that some semantic information is captured:

Query Results

Instructions on reproducing our results

The implementation of [1] is in Joint Training Model, and the implementation of [2] is in Code Summarization Transfer Learning.

Dataset

For [1], our dataset is provided within Joint Training Model. For [2], the full dataset is available on Google Cloud Platform.

To access the data on GCP, follow the instructions at https://cloud.google.com/storage/docs/access-public-data
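Publicly readable Cloud Storage objects can be fetched over plain HTTPS at `https://storage.googleapis.com/BUCKET_NAME/OBJECT_NAME`. The sketch below builds such a URL and downloads the object with the Python standard library. The bucket and object names shown are placeholders, not this project's actual dataset location, and the helper names `public_gcs_url` and `download_public_object` are hypothetical.

```python
import urllib.request
from urllib.parse import quote


def public_gcs_url(bucket, object_name):
    """Build the HTTPS URL for a publicly readable Cloud Storage object."""
    # Percent-encode the object name but keep "/" so nested paths stay readable.
    return "https://storage.googleapis.com/{}/{}".format(
        bucket, quote(object_name, safe="/")
    )


def download_public_object(bucket, object_name, dest_path):
    """Download a public object to dest_path (requires network access)."""
    urllib.request.urlretrieve(public_gcs_url(bucket, object_name), dest_path)


# Placeholder names for illustration only; substitute the real bucket and object.
example_url = public_gcs_url("example-bucket", "data/train set.csv")
print(example_url)
```

For many or large files, the `gsutil` or `gcloud storage` command-line tools described at the link above may be more convenient than per-object HTTPS downloads.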

References:

[1] https://guxd.github.io/papers/deepcs.pdf

[2] https://towardsdatascience.com/semantic-code-search-3cd6d244a39c
