open-source - http://archive.pkmital.com
computational audiovisual augmented reality research

Handwriting Recognition with LSTMs and ofxCaffe
https://archive.pkmital.com/2015/02/06/handwriting-recognition-with-lstms-and-ofxcaffe/
Fri, 06 Feb 2015 04:41:49 +0000

Long Short-Term Memory (LSTM) is a Recurrent Neural Network (RNN) architecture designed to model temporal sequences (e.g. audio, sentences, video) and their long-range dependencies better than conventional RNNs [1]. There is a lot of excitement in the machine learning community about LSTMs (and DeepMind’s counterpart, “Neural Turing Machines” [2], or Facebook’s “Memory Networks” [3]) because they overcome a fundamental limitation of conventional RNNs and achieve state-of-the-art benchmark performance on a number of tasks [4,5]:

  • Text-to-speech synthesis (Fan et al., Microsoft, Interspeech 2014)
  • Language identification (Gonzalez-Dominguez et al., Google, Interspeech 2014)
  • Large vocabulary speech recognition (Sak et al., Google, Interspeech 2014)
  • Prosody contour prediction (Fernandez et al., IBM, Interspeech 2014)
  • Medium vocabulary speech recognition (Geiger et al., Interspeech 2014)
  • English to French translation (Sutskever et al., Google, NIPS 2014)
  • Audio onset detection (Marchi et al., ICASSP 2014)
  • Social signal classification (Brueckner & Schuller, ICASSP 2014)
  • Arabic handwriting recognition (Bluche et al., DAS 2014)
  • TIMIT phoneme recognition (Graves et al., ICASSP 2013)
  • Optical character recognition (Breuel et al., ICDAR 2013)
  • Image caption generation (Vinyals et al., Google, 2014)
  • Video to textual description (Donahue et al., 2014)

The current dynamic state …

Tim J Smith guest blogs for David Bordwell
https://archive.pkmital.com/2011/02/20/tim-j-smith-guest-blogs-for-david-bordwell/
Sun, 20 Feb 2011 03:43:36 +0000

Tim J Smith, an expert in scene perception and film cognition and part of The DIEM Project [1], recently appeared as a guest blogger for David Bordwell, a leading film theorist with an impressive list of books and publications widely used in film cognition and film art research [2]. In his article featured on David’s site, Tim expands on his research on film cognition, including continuity editing [3], attentional synchrony [4], and the project we worked on in 2008-2010 as part of The DIEM Project. Since Tim’s feature on David Bordwell’s blog, The DIEM Project has seen a surge of publicity, with our Vimeo video loads exceeding 200,000 in a single day and features on dvice, slashfilm, gizmodo, Roger Ebert’s Facebook/Twitter, and the front page of imdb.com.

Not to mention, our tools and visualizations are finally reaching an audience with interests in film, photography, and cognition. If you haven’t yet seen some of our videos, please head on over to our Vimeo page, where you can see a range of videos embedded with eye-tracking of participants and many different visualizations of models of eye movements using machine learning, or start by reading Tim’s post on …
