Simplification of VoceVista

    The above is a reference to a Dropbox image that includes the three major boxes in VoceVista. If you read the panel below the image you will probably see why it needs clarification. It is a triple illustration of a soprano voice singing the eight notes as described.

    The duration of the series is about 8.3 seconds. How do I know this? The adjustable cursor, the vertical line, is set at 4168 milliseconds from the start of the scan, which is the LEFT edge of both left boxes. No doubt you'd prefer to be told that the cursor is just over 4 seconds from the start and that the scan will freeze when it gets to 8.336 seconds? Right? A little over 8 seconds.

    The top left box is a very ordinary display of the loudness of the sound from start to finish, all 8.3 seconds or so of it. The baseline that runs horizontally through its center represents silence, and the bigger the vertical excursion above and below the center, the louder the sound. In this display the singer gets louder from start to finish. This is not a very useful display.

    The lower left box needs some explanation, mainly because it is faint in places. I have increased the contrast of this shot and enlarged it so that some of the important detail is more obvious, namely the separation of the eight notes. The scale is variable, so we will stay with the setting used here.

    The operator has wisely chosen to give his attention to sounds below 5000 Hz, or cycles per second.
    That's the reason for the words "Spectrogram 5kHz": they state the upper limit of the box display, which can be altered.

    If the box had its X and Y quantities marked, it would have time in seconds along the lower edge and frequency in kHz along the left vertical edge, with ZERO at the bottom and 5kHz at the top. Now that you KNOW the lowest frequencies are at the bottom of the display, you can point to the lowest wriggly line and confidently say to yourself, "that is displaying the fundamental frequency of each of the eight notes."

    Now that you can just make out each of the eight notes, look at some of the upper lines and observe that, as well as the fundamental, you have SIX further harmonics displayed. You can even see the vibrato as each line deviates slightly above and below its center axis. The note steps are also more obvious in the harmonics. Notice also that the darker the line, the more intense or louder the sound, so clearly the magnitude of the harmonics falls relative to the fundamental [also called harmonic 1] as the lines become dimmer.

    So, to summarize: we go from the lowest line, the fundamental tone, up through the second, third, fourth, fifth, sixth, and seventh harmonics of EACH of the EIGHT notes, seven lines in all. The movable cursor, the vertical line, is set at the end of the fifth note and is labeled 1385 Hz. (1385 Hz is the second harmonic, so the fundamental of that note is about 692 Hz.) It may be of value to mention that the fundamental of two octaves would be below the lowest two lines.
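
    The harmonic ladder at the cursor can be checked with a few lines of arithmetic. This is only a sketch: it assumes the 1385 Hz cursor reading is exactly the second harmonic and that the display stops at exactly 5000 Hz. The function name harmonics_below is made up for this illustration and is not part of VoceVista.

```python
def harmonics_below(fundamental_hz, limit_hz):
    """Return (harmonic number, frequency in Hz) pairs up to limit_hz."""
    result = []
    number = 1
    while number * fundamental_hz <= limit_hz:
        result.append((number, number * fundamental_hz))
        number += 1
    return result


CURSOR_HZ = 1385.0         # from the text: second harmonic at the cursor
DISPLAY_LIMIT_HZ = 5000.0  # from the text: "Spectrogram 5kHz"

# The second harmonic is twice the fundamental, so halve the cursor reading.
fundamental = CURSOR_HZ / 2

for number, freq in harmonics_below(fundamental, DISPLAY_LIMIT_HZ):
    print(f"harmonic {number}: {freq:.1f} Hz")
```

    Under these assumptions the fundamental is 692.5 Hz and seven harmonics fit beneath the 5 kHz ceiling, which agrees with the seven lines counted in the lower left box.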

    The remaining box on the right needs more explanation. It is another way of displaying the information as found in the left box but with more detail.

    Notice that there are also seven peaks, the same as the number of lines in the left display.

    The leftmost peak is the FUNDAMENTAL {unfortunately also called the 1st harmonic}. From there to the right we have the 2nd, 3rd, 4th, 5th, 6th, and 7th harmonics. If you look carefully at the lower left display you will see a small line marking which of the harmonics is being investigated in the right box.

    This POWER Spectrum is also limited to 5 kHz by the presets, and you will notice from the reference to minus 15 dB that the device uses the loudest sound as the reference. This is unfortunate and non-standard. The standard loudness reference is associated with the softest sound definable and is described as 0 dBA. Pronounced, zero D-B-A. This is the level of sound created by a cricket as it spring-boards from one leaf to the next as we listen from 6 feet above on an extremely quiet night. 100 dBA is the sound of a freight train rumbling past at speed.
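
    To see what "using the loudest sound as the reference" means in practice, here is a small sketch that builds an artificial tone from seven harmonics and then measures each one in dB relative to the strongest. It is not how VoceVista works internally; that is not known here. The 692.5 Hz fundamental is half the 1385 Hz cursor reading, and only the minus 15 dB for the second harmonic comes from the screenshot. The other levels, the sample rate, and all the function names are placeholders chosen for illustration.

```python
import math

SAMPLE_RATE = 10_000   # Hz; gives a 5 kHz upper limit, like the display
NUM_SAMPLES = 4_000    # 0.4 s, so every harmonic completes whole cycles
FUNDAMENTAL = 692.5    # Hz; half of the 1385 Hz cursor reading
# Harmonic levels in dB relative to the strongest. Only the -15 dB for the
# second harmonic comes from the text; the rest are made-up placeholders.
LEVELS_DB = [0.0, -15.0, -27.0, -27.0, -33.0, -36.0, -40.0]


def synthesize():
    """Build a steady tone from the seven harmonics."""
    amplitudes = [10 ** (db / 20) for db in LEVELS_DB]
    return [
        sum(a * math.sin(2 * math.pi * FUNDAMENTAL * (k + 1) * n / SAMPLE_RATE)
            for k, a in enumerate(amplitudes))
        for n in range(NUM_SAMPLES)
    ]


def amplitude_at(signal, freq_hz):
    """Measure one frequency's amplitude by correlating with cosine and sine."""
    step = 2 * math.pi * freq_hz / SAMPLE_RATE
    re = sum(s * math.cos(step * n) for n, s in enumerate(signal))
    im = sum(s * math.sin(step * n) for n, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)


def levels_relative_to_peak(signal):
    """Return each harmonic's level in dB, with the strongest at 0 dB."""
    amps = [amplitude_at(signal, FUNDAMENTAL * k) for k in range(1, 8)]
    peak = max(amps)
    return [20 * math.log10(a / peak) for a in amps]


measured = levels_relative_to_peak(synthesize())
for k, db in enumerate(measured, start=1):
    print(f"harmonic {k}: {db:6.1f} dB")
```

    Because the strongest peak is pinned at 0 dB, every other reading comes out negative, which is why the screenshot shows figures such as minus 15 dB.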

    More practically, two people calmly conversing at 4 feet will produce a sound level of about 72 dBA, while a heated exchange may be 3 dB higher. 3 dB lower than the 72 dBA reference indicates gossip or confidentiality. So there's a range of 69 dBA to 75 dBA, also described as +/- 3 dB. This could also be said "plus or minus 3 dB with reference to 72 dBA." Strictly, a change of 3 dB means TWICE or HALF the sound power when compared with the reference; the ear needs nearer 10 dB before it judges a sound twice as loud.
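
    The decibel arithmetic is easy to check. The sketch below uses the standard definition, dB = 10 x log10(power ratio), under which 3 dB is a doubling of sound power. The loudness_ratio function rests on a common rule of thumb (perceived loudness doubles for roughly every 10 dB), which is an assumption added here, not something taken from the VoceVista display. The function names are made up for this illustration.

```python
import math


def power_ratio(db):
    """Convert a level difference in dB to a ratio of sound powers."""
    return 10 ** (db / 10)


def db_from_power_ratio(ratio):
    """Convert a ratio of sound powers to a level difference in dB."""
    return 10 * math.log10(ratio)


def loudness_ratio(db):
    """Rule of thumb (assumption): perceived loudness doubles per 10 dB."""
    return 2 ** (db / 10)


for change in (3, -3, 10, 12):
    print(f"{change:+d} dB: power x{power_ratio(change):.2f}, "
          f"perceived loudness roughly x{loudness_ratio(change):.2f}")
```

    Running it shows that plus or minus 3 dB doubles or halves the power, while the ear hears a much smaller change than that.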

    So when the operator says the second harmonic is minus 15 dB relative to the noise a racing freight train is making at, say, 50 feet, it may have been more useful and consistent to synthesize an artificial reference at conversation level and call it NORMAL. Then we would have a reference we could all relate to. Notice that the third and fourth harmonics are at about NORMAL level while the second harmonic is about 12 dB higher. Knowing that every 3 dB means a doubling of power, 12 dB is four doublings, so the second harmonic carries about SIXTEEN times the power of normal speaking level.
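
    Moving from one reference to another is just a subtraction. The sketch below takes levels measured against the loudest sound and restates them against a NORMAL level. The minus 15 dB for the second harmonic is from the screenshot; the minus 27 dB for the third and fourth harmonics is inferred from "about 12 dB higher" and is an assumption, as is placing NORMAL at that same level. The function name rereference is made up for this illustration.

```python
def rereference(levels_db, reference_db):
    """Restate levels against a new reference by subtracting its level."""
    return {name: level - reference_db for name, level in levels_db.items()}


# Levels relative to the loudest sound (0 dB at the top of the scale).
relative_to_peak = {
    "harmonic 2": -15.0,  # read from the screenshot
    "harmonic 3": -27.0,  # inferred: about 12 dB below harmonic 2
    "harmonic 4": -27.0,  # inferred: about 12 dB below harmonic 2
}

# Assumption: NORMAL sits where the third and fourth harmonics are.
NORMAL_DB = -27.0

relative_to_normal = rereference(relative_to_peak, NORMAL_DB)
for name, level in relative_to_normal.items():
    print(f"{name}: {level:+.0f} dB relative to NORMAL")
```

    With NORMAL as the reference, the third and fourth harmonics read 0 dB and the second harmonic reads plus 12 dB, figures that are easier to relate to everyday speech.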


    For those using Firefox, it is easiest to open two windows and have the image in one and the site in the other. As you may know, Win7 offers dual images from which to choose. Otherwise you will lose one while you look at the other.
