jagrvargen/Pitchfork_Word2Vec
A Simple Word2Vec Model in TensorFlow

This project is a first attempt at building a skip-gram Word2Vec model in Python 3.6 and TensorFlow 1.8, trained on album reviews from the contemporary music review site Pitchfork. Using 18,393 reviews sourced from Kaggle.com, I trained a neural network with Noise Contrastive Estimation (NCE) as the loss function and the Adam optimizer to minimize it.
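To illustrate what a skip-gram model trains on, here is a minimal sketch of how (center, context) word pairs can be generated from a tokenized review. It is not the code from this repository: the function name skipgram_pairs, the window parameter, and the sample sentence are assumptions made for illustration.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for each token and its neighbours.

    Hypothetical helper: `window` is the number of words considered on
    each side of the center word.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Made-up sentence standing in for a line of review text.
tokens = "the album opens with a slow drone".split()
pairs = skipgram_pairs(tokens, window=1)
```

The model then learns embeddings by predicting the context word from the center word. With NCE, each true pair is contrasted against a handful of randomly sampled words rather than the whole vocabulary.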

Getting Started

Environment

All packages were installed in the Anaconda 3.5.4 package management environment. Libraries used:

  • TensorFlow 1.8

  • Natural Language Toolkit (NLTK) 3.3

Deployment

This model was trained on FloydHub's cloud service in order to have access to a GPU. Training on a CPU is possible, but it may take several hours. To begin training the model, run the command: python3 clean_text.py. All the functionality to parse and batch the data is included in this file. Make sure that the reviews_corpus.txt file exists in the same directory.
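As a rough idea of what parsing and batching involves, the sketch below builds a fixed-size vocabulary, maps words to integer ids, and yields mini-batches of training pairs. It does not reproduce clean_text.py: the names build_vocab, encode, and batches, the "UNK" token, and the toy text are all assumptions.

```python
from collections import Counter


def build_vocab(tokens, max_size):
    """Map the most frequent words to ids; id 0 is reserved for unknown words."""
    word_to_id = {"UNK": 0}
    for word, _count in Counter(tokens).most_common(max_size - 1):
        word_to_id[word] = len(word_to_id)
    return word_to_id


def encode(tokens, word_to_id):
    """Replace each token with its id, falling back to 0 (UNK)."""
    return [word_to_id.get(tok, 0) for tok in tokens]


def batches(pairs, batch_size):
    """Yield (inputs, labels) lists; each label is wrapped as a one-item list."""
    for start in range(0, len(pairs), batch_size):
        chunk = pairs[start:start + batch_size]
        inputs = [center for center, _ in chunk]
        labels = [[context] for _, context in chunk]
        yield inputs, labels


# Toy text standing in for reviews_corpus.txt.
text = "drone pop drone rock drone pop".split()
vocab = build_vocab(text, max_size=3)
ids = encode(text, vocab)
pairs = list(zip(ids, ids[1:]))  # adjacent words only, for brevity
all_batches = list(batches(pairs, batch_size=2))
```

Each label is wrapped in a list because TensorFlow's tf.nn.nce_loss expects labels of shape [batch_size, num_true]. Whether the original script shapes its batches this way is an assumption.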

Author

Jesse Hedden

Acknowledgments

  • This project would not have been possible without constant reference to the excellent online tutorials on adventuresinmachinelearning.com. To produce visual output and track loss across training iterations, I referred to the code in their repository.

  • The Introduction to Machine Learning course by Andrew Ng, as well as his excellent tutorials on RNNs.
