sarveshsg

Sarvesh S G

Recently Published

Next Word Prediction Using R/Shiny
Word prediction presentation
Exploratory Analysis
We will analyse English data from a corpus called HC Corpora (www.corpora.heliohost.org); more about the corpus can be found at http://www.corpora.heliohost.org/aboutcorpus.html. The corpus contains text collected from three kinds of sources: blogs, Twitter, and news feeds. Our goal is to analyse this data for insights into the vocabulary used, and eventually to build an application that predicts the next word as the user types. We begin by cleaning the three subsets with basic clean-up operations: removing HTML tags, removing non-ASCII characters, and removing the HTML whitespace entity &nbsp;. We perform these operations on each subset separately and plot a word cloud for each to get a feel for the most frequently used words.
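The clean-up steps described above can be sketched roughly as follows. This is an illustrative Python sketch, not the author's R code. The function names (clean_line, word_frequencies) are hypothetical, the tag pattern is deliberately naive, and it assumes the character-removal step is meant to drop non-ASCII characters. The resulting word counts are the kind of input a word cloud would be drawn from.

```python
import re
from collections import Counter

# Hypothetical patterns for the three clean-up steps (assumptions, not the author's code).
NBSP_RE = re.compile(r"&nbsp;?")             # HTML non-breaking space entity, with or without ';'
TAG_RE = re.compile(r"<[^>]+>")              # naive HTML tag matcher
NON_ASCII_RE = re.compile(r"[^\x00-\x7F]+")  # anything outside the ASCII range
WORD_RE = re.compile(r"[a-z']+")             # lowercase words, keeping apostrophes


def clean_line(text):
    """Apply the basic clean-up steps to one line of raw text."""
    text = NBSP_RE.sub(" ", text)       # replace &nbsp with a plain space
    text = TAG_RE.sub(" ", text)        # strip HTML tags
    text = NON_ASCII_RE.sub(" ", text)  # drop non-ASCII characters
    return " ".join(text.split())       # collapse runs of whitespace


def word_frequencies(lines):
    """Count word occurrences across cleaned lines."""
    counts = Counter()
    for line in lines:
        counts.update(WORD_RE.findall(clean_line(line).lower()))
    return counts
```

Each subset (blogs, Twitter, news) would be passed through word_frequencies separately, and the most common entries of each result would then feed that subset's word cloud.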
Capstone assignment 1