Time-MMD: Multi-Domain Multimodal Dataset for Time Series Analysis

Haoxin Liu, Shangqing Xu, Zhiyuan Zhao, Lingkai Kong, Harshavardhan Kamarthi, Aditya B. Sasanur, Megha Sharma, Jiaming Cui, Qingsong Wen, Chao Zhang, B. A. Prakash
Georgia Institute of Technology
Neural Information Processing Systems (2024)
MM Benchmark

📝 Paper Summary

Multimodal Time Series Analysis · Time Series Forecasting · Dataset Construction
Time-MMD is the first multi-domain, multimodal time-series dataset. It is paired with a new forecasting library, MM-TSFlib, which is used to demonstrate that integrating textual data significantly improves forecasting accuracy.
Core Problem
Existing multimodal time series datasets are narrow (mostly financial), poorly aligned (the text is often irrelevant to the series), and contaminated (the text contains predictions or other data leaks). Together, these flaws prevent effective multimodal analysis.
Why it matters:
  • Real-world experts (e.g., epidemiologists) use text such as policies alongside numerical data, but current models are largely unimodal (numerical only).
  • Current datasets focus mostly on stock prediction and fail to capture the diverse patterns, such as periodicity or sparsity, found in other domains.
  • Data contamination in existing sets (e.g., text containing future predictions) leads to biased evaluations of Large Language Model (LLM) based forecasters.
Concrete Example: In epidemiology, a 'weekly influenza report' might contain a section that explicitly predicts next week's outlook. A model trained on this raw text effectively sees the answer in advance. Time-MMD uses LLMs to disentangle facts from predictions and prevent this leakage.
Key Novelty
Diverse Domain Coverage with LLM-Curated Alignment
  • Expands beyond finance to 9 domains (Health, Economics, Energy, etc.) with diverse temporal patterns.
  • Uses an LLM-based pipeline to filter out irrelevant text and, crucially, to separate 'facts' from 'predictions' to prevent data leakage.
  • Introduces a standardized binary timestamp system to align asynchronous textual reports (e.g., monthly) with numerical data (e.g., weekly).
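The timestamp-based alignment in the last point can be illustrated with a minimal sketch. It assumes that each record, textual or numerical, carries a start date and an end date, and that a report may be used for a target window only if it ended before that window begins. The names `Span`, `usable_reports`, and `lookback_days`, as well as the sample data, are hypothetical and are not taken from Time-MMD or MM-TSFlib.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Span:
    start: date  # first day the record covers
    end: date    # last day the record covers (inclusive)


def usable_reports(target: Span, reports: dict[str, Span],
                   lookback_days: int = 60) -> list[str]:
    """Return names of reports that ended before the target window starts,
    but no more than `lookback_days` before it (assumed rule, for illustration)."""
    earliest = target.start - timedelta(days=lookback_days)
    return sorted(
        name for name, span in reports.items()
        if earliest <= span.end < target.start
    )


# Hypothetical data: one weekly observation and three monthly reports.
week = Span(date(2024, 3, 4), date(2024, 3, 10))
monthly = {
    "jan_report": Span(date(2024, 1, 1), date(2024, 1, 31)),
    "feb_report": Span(date(2024, 2, 1), date(2024, 2, 29)),
    "mar_report": Span(date(2024, 3, 1), date(2024, 3, 31)),
}

# The March report overlaps the target week, so it is excluded.
print(usable_reports(week, monthly))  # ['feb_report', 'jan_report']
```

Comparing only the report's end date against the target's start date is one simple way to keep a monthly report from leaking information into a weekly target that it overlaps; the dataset's actual rule may differ.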
Architecture
Figure 6: The Multimodal Integration Framework used in MM-TSFlib.
Evaluation Highlights
  • Multimodal models outperformed unimodal baselines in 95% of over 1,000 experiments.
  • Multimodal models generally reduced mean squared error (MSE) by over 15% across domains.
  • Up to 40% MSE reduction in domains with rich textual data, validating the quality of the aligned text.
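For readers unfamiliar with the metric, the sketch below shows how a relative MSE reduction of this kind is typically computed. It assumes the reduction is measured relative to the unimodal baseline's MSE; the function names and the toy numbers are illustrative only and are not results from the paper.

```python
def mse(y_true: list[float], y_pred: list[float]) -> float:
    """Mean squared error between two equal-length sequences."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mse_reduction_pct(mse_unimodal: float, mse_multimodal: float) -> float:
    """Percentage by which the multimodal MSE is lower than the unimodal MSE."""
    return 100.0 * (mse_unimodal - mse_multimodal) / mse_unimodal


# Toy values chosen for easy arithmetic (not data from the paper).
y_true = [1.0, 2.0, 3.0, 4.0]
unimodal_pred = [1.5, 2.5, 2.5, 4.5]    # every error is 0.5 -> MSE 0.25
multimodal_pred = [1.0, 2.5, 3.0, 4.5]  # errors 0, 0.5, 0, 0.5 -> MSE 0.125

reduction = mse_reduction_pct(mse(y_true, unimodal_pred),
                              mse(y_true, multimodal_pred))
print(reduction)  # 50.0
```

A positive value means the multimodal model has lower error than the baseline; a negative value would mean adding text made the forecast worse.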
Breakthrough Assessment
8/10
Significant infrastructure contribution. Moving multimodal time series beyond just stock prediction is a major step. The rigorous decontamination pipeline addresses a critical flaw in previous datasets.