We made some pretty good progress. Our Wiki page is up (http://urban.cens.ucla.edu/cs219/index.php?title=SocialMirror), which is nice. Also, we got Orange to classify audio clips into speech and non-speech with >90% accuracy. Huge success!
Next, we need to write a script that can take an audio clip from our cell phone application and process it. We still need more training data, though, so hopefully we can get all those little things taken care of before the weekend.
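Here's a rough sketch of the shape that script could take: read the clip, build a small feature vector, and ask a previously trained model for a label. The file names are placeholders, and the pickled scikit-learn model is only a stand-in until we sort out how to call our Orange classifier from a script.

```python
# Sketch of the clip-processing script: load a phone recording, build a
# feature vector, and classify it with a previously trained model.
# "clip.wav" and "speech_model.pkl" are placeholder names; the pickled
# scikit-learn model stands in for whatever we export from Orange.
import pickle

import numpy as np
from scipy.io import wavfile

rate, samples = wavfile.read("clip.wav")
if samples.ndim > 1:                              # mix stereo down to mono
    samples = samples.mean(axis=1)
samples = samples.astype(np.float64)
samples /= max(np.abs(samples).max(), 1e-9)       # normalize to roughly [-1, 1]

# Whole-clip stand-ins for the real feature set (just ZCR and energy here).
zcr = np.mean(np.abs(np.diff(np.sign(samples)))) / 2.0
energy = np.mean(samples ** 2)
features = np.array([[zcr, energy]])

with open("speech_model.pkl", "rb") as f:
    model = pickle.load(f)

print("speech" if model.predict(features)[0] == 1 else "non-speech")
```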
Then it's onto gathering data and analyzing all the characteristics we want to find!
In the meantime...sleep is definitely something we all need.
Tuesday, May 20, 2008
Monday, May 19, 2008
It's been a while since any updates have been posted, so here's a little about what we've been up to.
-The proposal for Social Mirror was completed and uploaded to the main wiki page.
-Feature extraction works for an audio clip. That is, you provide a clip as input and it outputs all the features we're after: Mel-frequency cepstral coefficients, zero-crossing rate, short-time energy, sub-band energy distribution, brightness, bandwidth, spectrum flux, band periodicity, and noise frame ratio. (A small sketch of two of these follows this list.)
-We're currently working on Orange and being able to differentiate between speech and non-speech clips.
-Soon to be up: Project wiki page
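As an illustration, here is roughly how two of the simpler features in that list, zero-crossing rate and short-time energy, are computed per frame. This is just a standalone NumPy sketch with made-up frame sizes; the MFCCs and the spectral features come from the actual extraction code.

```python
# Standalone sketch of two of the listed features, computed frame by frame:
# zero-crossing rate and short-time energy. Frame/hop lengths are arbitrary.
import numpy as np

def frame_features(samples, frame_len=400, hop=160):
    samples = samples.astype(np.float64)
    zcrs, energies = [], []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # zero-crossing rate: fraction of adjacent samples that change sign
        zcrs.append(np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0)
        # short-time energy: mean squared amplitude within the frame
        energies.append(np.mean(frame ** 2))
    return np.array(zcrs), np.array(energies)
```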
Orange is harder than we initially thought. We're trying to figure out how to work with the class probabilities, but so far, no luck. Hopefully we'll get more done today since we're meeting at 5 PM (or at least two of us are).
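What we're after is per-class probabilities rather than just a hard label. Since we haven't cracked the Orange call yet, here is the same idea sketched with scikit-learn as a stand-in; the feature values and labels below are toy numbers.

```python
# Stand-in for the probability output we want from Orange, shown with
# scikit-learn. The two-feature rows and labels below are toy values.
import numpy as np
from sklearn.naive_bayes import GaussianNB

X = np.array([[0.31, 0.002], [0.05, 0.020], [0.28, 0.003], [0.07, 0.018]])
y = np.array([0, 1, 0, 1])            # 0 = non-speech, 1 = speech

model = GaussianNB().fit(X, y)
probs = model.predict_proba([[0.10, 0.015]])[0]
print("P(non-speech) = %.2f, P(speech) = %.2f" % (probs[0], probs[1]))
```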
Wednesday, May 14, 2008
Tuesday, May 6, 2008
We found out that our last idea, Storification, has already been done by both Microsoft and Nokia Research. SenseCam does everything we wanted to do and more. So we changed our idea to focus on the audio aspect of memory aid; hence the new project summary below.
In our project proposal, I put a tentative timeline that we can use to get through this project. Look for it on the left - we'll need to split up the work so we do this efficiently.
First step: Test the speech recognition software James sent in the e-mail (a quick smoke-test sketch follows the list):
Link to all speech recognition software: http://en.wikipedia.org/wiki/List_of_speech_recognition_software
CMU Sphinx — James
HTK — copyrighted by Microsoft, but altering the software for the Licensee's internal use is allowed. - J
ISIP Toolkit - James
Julius — BSD-style license, but seems too old; mostly in Japanese
Simon — GPL Licence - J
VoxForge — open source, GPL - KD
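For CMU Sphinx, the quick smoke test I have in mind is just to shell out to the pocketsphinx_continuous tool on a sample recording and see what it prints. This assumes a pocketsphinx install whose binary is on the PATH and accepts -infile; the WAV name is a placeholder.

```python
# Smoke test for CMU Sphinx: run pocketsphinx_continuous on one WAV file and
# print whatever hypothesis it produces. Assumes the binary is on PATH and
# supports -infile; "test_clip.wav" is a placeholder recording.
import subprocess

result = subprocess.run(
    ["pocketsphinx_continuous", "-infile", "test_clip.wav"],
    capture_output=True,
    text=True,
)
print(result.stdout)          # recognized text, if any
print(result.stderr[-500:])   # tail of the log, handy when it fails
```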
And I'll send an e-mail to the Nokia Research lecturer, Péter Pál Boda, once I find his e-mail...
One more thing - I changed the URL of our blog. Too bad talksense.blogger.com was already taken. Any other suggestions? Leave them in the comments. =)
TalkSense
New idea:
We would like to write an application that will help Alzheimer's patients or other people with memory problems. We plan to have a program running 24/7 that would record clips when human voices are detected. These clips will then be uploaded to a server and stamped with a time and location. We also want to use the phone's Bluetooth to "figure out" who you are having a conversation with. This may require the user to choose the person they are conversing with from a list of Bluetooth devices the first time that device is sensed. The phone will store this knowledge for future use, so it will be a one-time entry for the user. This will enable a person to go back and search audio clips for conversations with specific people.
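One way the one-time Bluetooth-to-person mapping could work is a small persistent table from device address to name, kept on the phone or mirrored on the server. This is only a sketch; the function and file names are made up for illustration.

```python
# Sketch of the one-time Bluetooth-device-to-person mapping described above.
# File and function names are illustrative, not part of the real app.
import json
import os

MAPPING_FILE = "bt_people.json"

def load_mapping():
    if os.path.exists(MAPPING_FILE):
        with open(MAPPING_FILE) as f:
            return json.load(f)
    return {}

def remember_person(bt_address, person_name):
    """Called the first time an unknown Bluetooth device is seen nearby."""
    mapping = load_mapping()
    mapping[bt_address] = person_name
    with open(MAPPING_FILE, "w") as f:
        json.dump(mapping, f)

def who_is(bt_address):
    """Tag a recorded clip with a person, if the device is already known."""
    return load_mapping().get(bt_address)
```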
Thursday, May 1, 2008
Summary
We are planning on giving people a way to look back and see what a typical day in their lives looks like. This is useful for remembering past events without having to write anything down. The cell phone will be hung around a person's neck and will take pictures constantly throughout the day. We will then look at a few different aspects to classify an event as "interesting," since we would not want to display every picture taken during the day. We will look at both location and time to see where the user spends most of his/her time and include a picture from those places. Additionally, we will do some image processing to try to reduce the number of duplicate pictures shown. We would also like to label certain dynamic events, such as a car crash that a user may focus on, as "interesting." Lastly, we will allow a person to customize what they believe is "interesting" in order to tailor the logs to each individual.
Next step: Write a 2 page proposal by Monday.