
The Science
IN PROCESS
Acoustic Features to Health Indicators
Running algorithms now: 7
Next to be added: 12
Disclaimer:
The relationships described below are based on existing voice research and are intended to guide the development of analytical tools. These acoustic features are potential indicators and not diagnostic tools on their own. Clinical diagnosis requires comprehensive assessment by qualified healthcare professionals.
Linking to Specific Research Goals:
Alzheimer's, Parkinson's, Dementia (Neurodegenerative Conditions):
- Key indicators: Changes in F0 variability (monopitch), intensity (hypophonia), jitter/shimmer, HNR, formant centralization (vowel space reduction), speech rate, and pause patterns.
- Longitudinal tracking of these features is crucial for early detection and monitoring progression.
PTSD & Stress:
- Key indicators: Alterations in mean F0, F0 variability, intensity variability, speech rate, pause frequency, and potentially MFCC patterns for emotional valence.
- Acute vs. chronic stress might show different vocal signatures.
Fatigue and Energy Levels:
- Key indicators: Reduced mean F0, reduced F0 variability, reduced intensity, increased jitter/shimmer, reduced HNR, slower speech rate, increased pauses.
- These often reflect decreased neuromuscular control and vocal effort capacity.
1. Fundamental Frequency (Pitch) Features:
Mean F0:
- Parkinson's Disease: Mean F0 may be abnormally high or low, and it often co-occurs with monopitch (reduced F0 variation).
- Alzheimer's/Dementia: May show changes, though less consistently defined than in Parkinson's; could relate to apathy or agitation.
- PTSD/Stress: Can be elevated in acute stress or show reduced variability in chronic stress/depression.
- Fatigue: May lead to lower mean F0, with less stable pitch as control diminishes.
F0 Standard Deviation (SD F0) / Range (Prosodic Variability):
- Parkinson's Disease: Significantly reduced SD F0 (monopitch) is a hallmark.
- Alzheimer's/Dementia: Reduced prosodic variability can indicate cognitive decline and affective flattening.
- PTSD/Stress: May be reduced (monotonous speech) in depression/PTSD, or exaggerated in agitated states.
- Fatigue: Can lead to reduced F0 variability due to reduced muscular effort and control.
Voiced/Unvoiced Segments:
- General Neurological Conditions: Changes in the proportion or duration of voiced/unvoiced segments can reflect articulatory difficulties or changes in speech planning.
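As a rough illustration of the pitch measures above, the sketch below summarizes a frame-level F0 track. It assumes the track comes from a separate pitch tracker and that unvoiced frames are marked as 0 or None. The function name `f0_summary` and the output keys are hypothetical, not part of any existing tool.

```python
import math
import statistics


def f0_summary(f0_track_hz):
    """Summarize a frame-level F0 track (Hz); 0 or None marks an unvoiced frame."""
    voiced = [f for f in f0_track_hz if f]  # keep voiced frames only
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    return {
        "mean_f0_hz": statistics.fmean(voiced),
        # Sample standard deviation of F0: low values are consistent with monopitch.
        "sd_f0_hz": statistics.stdev(voiced),
        # Range in semitones, which is less dependent on the speaker's pitch register.
        "range_semitones": 12 * math.log2(max(voiced) / min(voiced)),
        # Share of frames that are voiced (assumes equal-length frames).
        "voiced_fraction": len(voiced) / len(f0_track_hz),
    }
```

Reporting the range in semitones rather than Hz is a design choice that makes values easier to compare across speakers with different habitual pitch.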
2. Intensity (Loudness) Features:
Mean Intensity:
- Parkinson's Disease: Often reduced intensity (hypophonia) is a key symptom.
- Alzheimer's/Dementia: Can vary; some may speak softer, others louder due to hearing issues or agitation.
- PTSD/Stress: May be lower in depressive states or higher in agitated states.
- Fatigue: Typically leads to reduced vocal intensity.
Intensity Standard Deviation (SD Intensity) / Range:
- Parkinson's Disease: Reduced intensity variability (monoloudness).
- Alzheimer's/Dementia: Can be reduced, reflecting flattened affect.
- PTSD/Stress: Similar to F0 variability, can be reduced or exaggerated.
- Fatigue: Reduced ability to modulate loudness, leading to lower SD Intensity.
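The intensity measures above can be sketched as frame-level RMS levels in decibels. This is a minimal example, assuming audio samples scaled to the range -1 to 1 and non-overlapping frames; the levels are relative to full scale, not calibrated sound pressure levels, and the function names are hypothetical.

```python
import math
import statistics


def frame_levels_db(samples, frame_len, floor=1e-10):
    """Return the RMS level (dB re full scale) of each non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        # The floor avoids log10(0) on digitally silent frames.
        levels.append(20 * math.log10(max(rms, floor)))
    return levels


def intensity_summary(samples, frame_len):
    """Mean and standard deviation of frame levels, in dB."""
    levels = frame_levels_db(samples, frame_len)
    if not levels:
        raise ValueError("signal is shorter than one frame")
    return {
        "mean_db": statistics.fmean(levels),
        "sd_db": statistics.pstdev(levels),
    }
```

In practice, silent frames would likely need to be excluded first, since long pauses would otherwise dominate both statistics.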
3. Jitter and Shimmer (Vocal Stability):
Jitter (Frequency Perturbation):
- Parkinson's Disease: Often elevated, indicating vocal instability and hoarseness.
- Alzheimer's/Dementia: May be elevated, reflecting laryngeal control issues.
- General Neurological Conditions: Increased jitter is a common sign of vocal fold irregularity.
- Stress/Fatigue: Can increase due to muscle tension or reduced neuromuscular control.
Shimmer (Amplitude Perturbation):
- Parkinson's Disease: Often elevated, contributing to the perception of a hoarse or breathy voice.
- Alzheimer's/Dementia: May be elevated.
- General Neurological Conditions: Increased shimmer indicates instability in vocal fold vibration amplitude.
- Stress/Fatigue: Can increase due to muscle tension or reduced neuromuscular control, as with jitter.
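Jitter and shimmer share one arithmetic form: the mean absolute difference between consecutive cycles, divided by the mean value. The sketch below shows this "local" variant, assuming the glottal cycle periods and per-cycle peak amplitudes have already been extracted by some other step; the function names are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def local_jitter(periods_s):
    """Local jitter as a fraction, from consecutive cycle periods in seconds."""
    return local_perturbation(periods_s)


def local_shimmer(peak_amplitudes):
    """Local shimmer as a fraction, from consecutive per-cycle peak amplitudes."""
    return local_perturbation(peak_amplitudes)
```

Multiplying either result by 100 gives the percentage form in which these measures are usually reported.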
4. Harmonicity Features (e.g., Harmonics-to-Noise Ratio - HNR):
HNR:
- Parkinson's Disease: Often reduced, indicating a noisier or breathier voice quality due to incomplete glottal closure or irregular vocal fold vibration.
- Alzheimer's/Dementia: May be reduced, associated with overall vocal decline.
- General Voice Disorders/Stress: Lower HNR can indicate various laryngeal pathologies or increased laryngeal tension.
- Fatigue: Can lead to reduced HNR due to less efficient vocal production.
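One common way to estimate HNR uses the peak of the normalized autocorrelation, r, within the expected pitch-period range, with HNR = 10 * log10(r / (1 - r)). The sketch below is a simplified version of that idea, assuming a single short voiced frame and a lag range given in samples. It omits the windowing corrections that dedicated voice-analysis tools apply, and the function name is hypothetical.

```python
import math


def hnr_db(frame, min_lag, max_lag):
    """Estimate HNR (dB) from the peak normalized autocorrelation in a lag range."""
    if min_lag < 1 or max_lag >= len(frame):
        raise ValueError("lags must satisfy 1 <= min_lag <= max_lag < len(frame)")
    best_r = 0.0
    for lag in range(min_lag, max_lag + 1):
        a, b = frame[:-lag], frame[lag:]
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        if den > 0:
            best_r = max(best_r, num / den)
    # Clamp so that perfectly periodic or uncorrelated frames stay finite.
    r = min(max(best_r, 1e-9), 1 - 1e-9)
    return 10 * math.log10(r / (1 - r))
```

The clamp means a perfectly periodic frame reports about 90 dB rather than infinity, which is an arbitrary ceiling chosen for this sketch.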
5. Formant Features (Vocal Tract Resonances):
Formant Frequencies (F1, F2, F3, etc.) and Bandwidths:
- Parkinson's Disease: Reduced vowel space (e.g., centralization of vowels, imprecise articulation) can be reflected in formant shifts.
- Alzheimer's/Dementia: Imprecise articulation due to cognitive-motor decline can alter formant structures.
- General Motor Speech Disorders: Changes in formant patterns are key indicators of articulatory imprecision.
Vowel Articulation Index / Formant Centralization Ratio:
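- Measures derived from formant frequencies that quantify articulatory precision. Greater vowel centralization raises the Formant Centralization Ratio and lowers the Vowel Articulation Index.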
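Both derived measures are typically computed from the first two formants of the corner vowels /i/, /a/, and /u/. The sketch below uses the definition commonly given in the motor speech literature, FCR = (F2u + F2a + F1i + F1u) / (F2i + F1a), with VAI as its reciprocal. The function name and the formant values in the usage lines are illustrative assumptions, not measured data.

```python
def formant_centralization(f1_i, f2_i, f1_a, f2_a, f1_u, f2_u):
    """Return FCR and VAI from corner-vowel formants (Hz) of /i/, /a/, /u/."""
    # Formants that rise with centralization go in the numerator,
    # those that fall go in the denominator.
    fcr = (f2_u + f2_a + f1_i + f1_u) / (f2_i + f1_a)
    return {"fcr": fcr, "vai": 1.0 / fcr}


# Illustrative (made-up) formant values in Hz.
example = formant_centralization(
    f1_i=300, f2_i=2300, f1_a=750, f2_a=1300, f1_u=350, f2_u=900
)
```

Because both measures are ratios of formant sums, they are less sensitive to differences in vocal tract size between speakers than raw formant values.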
6. Speech Rate and Pause Analysis:
Speech Rate (e.g., syllables/second, words/minute):
- Parkinson's Disease: Can be variable; some experience slow speech (related to bradykinesia), others festinating speech (accelerated, then trailing off).
- Alzheimer's/Dementia: Often slower speech rate, increased pauses, and word-finding difficulties reflected in timing.
- PTSD/Stress: Can be faster in anxiety, slower in depression. Hesitations and pauses may increase.
- Fatigue: Typically leads to slower speech rate and more frequent/longer pauses.
Pause Characteristics (Frequency, Duration, Filled/Unfilled Pauses):
- Alzheimer's/Dementia: Increased frequency and duration of pauses, often related to anomia or cognitive processing delays.
- Parkinson's Disease: Inappropriate silences or difficulty initiating speech.
- PTSD/Stress: Increased hesitations, filled pauses (um, uh) can indicate anxiety or cognitive load.
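The timing measures above can be sketched from a list of speech intervals. This example assumes the intervals (start and end times in seconds) and the syllable count come from an earlier voice-activity or transcription step, and it treats any silent gap of at least 0.25 s as a pause. That threshold and the function name are assumptions for illustration.

```python
def pause_stats(speech_intervals, n_syllables, min_pause_s=0.25):
    """Speech rate and unfilled-pause statistics from (start, end) speech intervals."""
    intervals = sorted(speech_intervals)
    if not intervals:
        raise ValueError("no speech intervals given")
    total_s = intervals[-1][1] - intervals[0][0]
    # Silent gaps between consecutive speech intervals.
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(intervals, intervals[1:])]
    pauses = [g for g in gaps if g >= min_pause_s]
    pause_s = sum(pauses)
    return {
        # Speech rate includes pause time; articulation rate excludes it.
        "speech_rate_syl_per_s": n_syllables / total_s,
        "articulation_rate_syl_per_s": n_syllables / (total_s - pause_s),
        "pause_count": len(pauses),
        "mean_pause_s": pause_s / len(pauses) if pauses else 0.0,
        "pause_time_fraction": pause_s / total_s,
    }
```

Filled pauses (um, uh) are not silent, so they would need to be identified separately, for example from a transcript.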
7. Mel-Frequency Cepstral Coefficients (MFCCs):
MFCCs: While not directly interpretable like F0 or intensity, MFCCs capture overall spectral envelope characteristics. Changes in MFCC patterns, often analyzed with machine learning, can be sensitive to:
- Neurological Conditions: Subtle changes in articulation, vocal tract configuration, and voice quality that differentiate patient groups from controls.
- Emotional State/Stress: MFCCs are used in emotion recognition from speech.
- Fatigue: Changes in vocal effort and articulation can be reflected in MFCCs.
8. Spectral Features (e.g., LTAS, Spectral Tilt):
Long-Term Average Spectrum (LTAS) / Spectral Tilt:
- Parkinson's Disease: Can show changes reflecting altered voice quality (e.g., reduced high-frequency energy).
- Voice Quality Assessment: Spectral tilt is related to perceived breathiness or strain.
- Fatigue/Effort: Increased vocal effort can alter spectral balance.