Mass project

Welcome to the Masking Ambient Speech Sounds project Wiki.

Experiments

Beta Test

The first listening tests will involve project staff members, to check that things make sense. If the results look good, we'll start working with non-project volunteers. Experiment 1, in the CCRMA "Pit," will take about 30 minutes and involve 30 trials: 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if their sources were outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving the sounds with the measured impulse responses.
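As a rough illustration of the 6 x 5 crossing, the trial list could be built and randomized along the following lines. This is only a sketch and not the project's actual code: the name makeTrials, the Trial type and the seed parameter are hypothetical, and conditions are represented by plain indices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// One trial pairs a masker condition index with a speech condition index.
using Trial = std::pair<int, int>;

// Cross every masker condition with every speech condition, then shuffle.
// The seed parameter is hypothetical; logging it would let a session be replayed.
std::vector<Trial> makeTrials(int numMaskers, int numSpeech, unsigned seed) {
    assert(numMaskers > 0 && numSpeech > 0);
    std::vector<Trial> trials;
    trials.reserve(static_cast<std::size_t>(numMaskers) * numSpeech);
    for (int m = 0; m < numMaskers; ++m)
        for (int s = 0; s < numSpeech; ++s)
            trials.emplace_back(m, s);
    std::mt19937 rng(seed);  // deterministic for a given seed
    std::shuffle(trials.begin(), trials.end(), rng);
    return trials;
}
```

Calling makeTrials(6, 5, seed) yields 30 trials in which each masker/speech pairing appears exactly once.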

Necessary ingredients: (x = done)

(x) ambient room sound recording from Tokyo
(x) 15 sec. recordings of FM noise masker with parameter variation
(x) 4 min. recordings of 4 conversations (animated / not-animated, crowd / pair, always 50% gender balance)
(x) 15 sec. clips cut from conversations
(x) convolved versions of 15 sec. files putting them "as if" in the hallway
(x) GUI for running randomized listening, A/B forced choice, logging results
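For reference, the "as if in the hallway" step amounts to convolving each dry clip with a measured impulse response. The sketch below uses plain direct-form convolution on a single channel. It is an illustration under assumptions, not the tool that produced the project files: the function name convolve is hypothetical, and for 15 sec. clips an FFT-based method would be far faster.

```cpp
#include <cstddef>
#include <vector>

// Direct-form convolution of a dry signal with an impulse response.
// Output length is dry.size() + ir.size() - 1 (empty if either input is empty).
std::vector<float> convolve(const std::vector<float>& dry,
                            const std::vector<float>& ir) {
    if (dry.empty() || ir.empty()) return {};
    std::vector<float> wet(dry.size() + ir.size() - 1, 0.0f);
    for (std::size_t n = 0; n < dry.size(); ++n)
        for (std::size_t k = 0; k < ir.size(); ++k)
            wet[n + k] += dry[n] * ir[k];  // each input sample excites a scaled copy of the IR
    return wet;
}
```

With the 4CH setup, the same dry clip would be convolved once per loudspeaker channel, each with its own impulse response.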

Strategies to define conditions for FM masking noise

To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.

Noise set contains complete technical documentation of the masking noise generation. It also contains the sound files.

The conditions of the masking FM noise will be defined by the following criteria:

3 bands of FM noise will be used (centered at 200, 350 and 500 Hz):
These bands were selected based on an analysis of speech recorded in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall filters out much of the high-frequency content, which is relevant to the selection of the main frequencies.
The amplitude (volume) of each band will be fixed:
The amplitude was tuned in order to psychoacoustically balance the level of the three noise bands that will be used. This balance was done without modulation.
The amplitude of the modulation will be proportional to the modulation frequency:
The motivation behind this choice is to minimize the annoyance effect. When the modulation rate is low, higher modulation amplitudes are more noticeable and annoying.
The relation between the modulation frequencies of the 3 bands is then the main factor that defines the conditions:
For this experiment, 3 modulation rates are selected: 2, 5 and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. Basically, all the combinations of these 3 rates are used for each center frequency, plus a case with no modulation at all.
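The "amplitude proportional to modulation frequency" rule can be sketched as follows. This is a hedged illustration only: it assumes sinusoidal modulation of a band's center frequency, and the proportionality constant depthPerHz and both function names are hypothetical. The actual generation is described in the Noise set documentation.

```cpp
#include <cmath>

// Modulation depth (Hz of deviation) grows linearly with the modulation rate.
// depthPerHz is a hypothetical constant, not a value taken from the project.
double modulationDepth(double rateHz, double depthPerHz) {
    return depthPerHz * rateHz;
}

// Instantaneous center frequency of one noise band at time tSec,
// assuming sinusoidal modulation around centerHz.
double instantaneousCenter(double centerHz, double rateHz,
                           double depthPerHz, double tSec) {
    const double twoPi = 2.0 * std::acos(-1.0);
    return centerHz +
           modulationDepth(rateHz, depthPerHz) * std::sin(twoPi * rateHz * tSec);
}
```

Under this rule a 2 Hz modulation gets a smaller excursion than a 7 Hz one, and a rate of 0 reduces to the unmodulated case.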

--Jcaceres 17:09, 24 July 2006 (PDT)

Beta test TODOs

The beta-test of the experiment tool took longer than anticipated. Some minor fixes remain. The ones I remember from yesterday (Friday, 28th) and the ToDo list for Monday (x = done):

(x) delete input slider from bottom of GUI (in Qt Designer), final product should look like the picture above
(x) when user hits "OK, Next" button, clear all the radiobuttons, with radiobutton->setDown(false)
This worked out with setChecked(false) (inside a method, not in the connection)
(x) comment out all the "cout" statements that are printing during trials, except for the one that says "behind"
(x) find a sticky way to keep machine speed at max during trial (automatic energy saving may be the reason for the occasional stuttering)
Jason comments:

/usr/bin/cpufreq-selector -g performance

you will select the "performance" governor and the cpu speed should go to the max and stay there.

/usr/bin/cpufreq-selector -g userspace

will return the governor to the original "userspace" governor.

And:

/usr/bin/cpufreq-selector -f 1000000

will get the processor to the slow idle speed.

From there the speed should again be "on demand". Regretfully it looks like sometimes the background daemon ("cpuspeed") gets fooled by these changes and dies. At least you can control all of this manually.
(x) convert QString to const char for logger class file open (use const char * QString::latin1 ())
(x) create a "shuffle" sort method in MainDialog.cpp and apply it for the actual first test
I think it is OK for each individual mono file to repeat on its own, but I'm worried that they could slip out of sync. I don't know for sure. It would be better if the repeats for a group of four were driven by the first channel's repeat.
add envelopes at all file starts, stops, repeats (with STK's Asymp class), pipe the file's output through it
I still need to add this, but I think that after the first experiments, what we really need to do is make much longer files so they don't repeat and the listener doesn't get a cue from that repetition.
If I've created a problem with disk files keeping up, you will see the message "behind" printed from FileWvIn and playback will start stuttering. The next fix to try (and this might be important anyway for our sanity) is to go to quad files rather than 4 mono files for each layer.
This doesn't look easy, I think I have to modify the entire Jukebox.cpp class in order to get this working...
(x) Add a dialog in case the user doesn't select an option.
(x) Stop the silence from always being on A. Also fix the "correctness" of the selection, which is currently always set to A.
(x) Turn off sounds (alternatives A and B) when the user goes to the next case.
(x) Program is crashing at the end (it's quitting badly). If you run it to the end, it doesn't write anything to the output files. If you stop it in the middle, it works. It is probably a problem with some destructor...
I get the problem with these test files:

/usr/ccrma/snd/jcaceres/yamaha/recordings/experiments/experiment01/TEST/

the message is:

terminate called after throwing an instance of 'std::bad_alloc'

what(): St9bad_alloc

Aborted

It works fine with this set of files:

QString rootDir ("/usr/ccrma/snd/jcaceres/yamaha/recordings/experiments/experiment01/");

FOUND IT!!! It was a problem reading a vector in MainDialog::setTrial (int n)

_____________ there are probably more things I'm forgetting, but this is close
GOOD LUCK!

--Cc 09:42, 29 July 2006 (PDT)

Findings on the Beta Test

There is a low-frequency component of the voice that is currently not being masked.
We need to use a really long conversation that never repeats during the experiment.
This corpus of conversations needs to have "stationary" properties.

Bottom lines

We're going to use just one room (Tokyo Office)
We keep the 4CH setup.
Spatialization ???

Conference Call Meetings

July 18, 2006

FM Modulation discussion (Yasushi's Comments, with Juan-Pablo's comment on answer A:):

Do you have any idea how to specify frequency modulation for each frequency band?
- A: based on speech freq, ~2-8 Hz
The period in time for each frequency should be the same?
- A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.
Modulation speed will be getting faster according to higher frequency, or
- A: I don't know yet, this is going to be the main parameter in the first experiment I think.
The frequency modulation considering the voice sound
We have to analyze how the voice sound is modulated in different frequency bands?
- A: I think this is the best way, and we have to consider that the wall is filtering almost all the high frequencies.

Discussion of the experiment setup.

Look at the documentation, the new example of impulse responses, and delay of arrival.

July 24, 2006

Tuesday 9:30AM Japan - Monday 5:30PM Stanford

Discuss Experiment 1.
Ask Atsuko about calibration files and SPL meter.
Comment on diffusion in the Pit with PZM system (Hiroko).
Discuss Experiment Design written by Hiroko and Atsuko.

July 31, 2006

Tuesday 9:30AM Japan - Monday 5:30PM Stanford

Discuss Experiment Design written by Hiroko and Atsuko.
Explain experiment setup.
Discuss Atsuko's agenda at CCRMA.
Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyse some data from a couple of subjects.

Parameters for the Noise Generation

Variables

modulation width (critical band or speech sounds)
modulation rate (0.01 - 0.1 fc)
sinusoidal or stochastic modulation

Already fixed

with broadband noise (what shape, and how loud? - according to the speech)
band width of the noise (critical band)
amplitude of each channel (speech sounds spectral distribution)
number and frequency of center frequencies (3)

--Hiroko 18:27, 31 July 2006 (PDT)

Experiment design

masking sounds with three variables
one sec. noise+speech, noise only comparison
all stimuli mixed and randomized

problem of the beta-test

repeatable, long conversation -> Fixed length, no repeat
either one of two stimuli is expected with speech

what experiment to do

efficiency test
- Stimuli: Noise+speech (5 vowels w/ consonants) + no sound (noise only), one second stimuli, no repeat.
- Question: "Is there a speech or not?" The answer is "Presence" or "Absence"
- Masker: 30 noises - Juan-Pablo defines these 30 kinds by whatever explainable strategy.
- Speech: 5 words by a male and 5 words by a female. Each word consists of one syllable, with one word for each of the 5 vowels.
- Pros: very short and covers many stimuli.

intelligibility test
- Speech sounds: Idiomatic phrases and isolated words (Each masking sound has a phrase and word) - TBD
- 4 sec per stimulus (15 stimuli/min)
- Measure audibility and intelligibility thresholds
- The answers the subject chooses from are: "I don't hear speech" "Speech is audible" "Speech is intelligible"
- completely random order (across the groups of (1) sentences/words (2) masking noise types and (3) playback level)
- we prohibit presenting the same stimulus sequentially from intelligible to less intelligible.
- analysis - we do not check the correctness - we only measure the intelligible impression

annoyance test

Atsuko's visit Agenda

Friday August 4th,
1pm Meeting (listening room)

5:30pm - Conference Call Japan
Saturday August 5th
Noise, narrowing parameters.
Sunday August 6th
Meetings with Jonathan Berger and Hiroko
Monday August 7th
Psychoacoustic generic tests (Hiroko)

Brainstorm spatialization parts - experiment strategies

Meeting: Jonathan Berger, Jason, Juan-Pablo, Atsuko and Hiroko.
Wednesday 9th,
1pm - Meeting

5:30pm - Conference Call Japan

Links

MASS Technical documentation: we are generating this documentation from the Matlab scripts. All the functions created are also documented.
Mass project - support materials by Hiroko, with pictures, sounds and PDF documents on psychoacoustic experiment.
Experiment C++ Source Code Documentation
