
Vintage Film AI Training Dataset

License a large-scale dataset of vintage 8mm and Super 8 archival footage for computer vision, generative AI, and temporal analysis research. Over 217,560 digitized clips with structured metadata spanning the 1930s–1980s.

217,560 clips | ~800,000 still frames | 396+ hours | 129 countries

Key Stats

217,560+ Video Clips
~800K Still Images
396+ Hours of Footage
129 Countries
1,435+ Cities
1930s–1980s Temporal Range

Dataset Contents

Each clip record includes the following structured metadata fields:

Metadata coverage across the full dataset:

Keywords: 100%
Description: 100%

Film Format Breakdown

All footage in the collection was originally shot on analog film stock:

8mm: 159,495 clips
Super 8: 1,146 clips

Decade Coverage

Distribution of clips by the decade in which they were shot:

1930s: 7,867
1940s: 10,573
1950s: 34,294
1960s: 68,156
1970s: 32,001
1980s: 7,554
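
The decade counts above are uneven, which matters when sampling training data. As a rough illustration, the Python sketch below computes each decade's share of the listed clips and an inverse-frequency sampling weight. The counts are copied from the list above; the weighting scheme is one common choice and an assumption on our part, not something the dataset prescribes.

```python
# Clip counts per decade, copied from the Decade Coverage list.
decade_counts = {
    "1930s": 7_867,
    "1940s": 10_573,
    "1950s": 34_294,
    "1960s": 68_156,
    "1970s": 32_001,
    "1980s": 7_554,
}

# Number of clips that carry one of the decade labels listed above.
total = sum(decade_counts.values())

# Share of the decade-labelled clips that fall in each decade.
shares = {decade: n / total for decade, n in decade_counts.items()}

# Inverse-frequency weights (illustrative assumption): if each clip is sampled
# with probability proportional to its decade's weight, every decade is drawn
# equally often in expectation.
weights = {
    decade: total / (len(decade_counts) * n)
    for decade, n in decade_counts.items()
}

for decade in decade_counts:
    print(f"{decade}: {shares[decade]:6.1%} of clips, weight {weights[decade]:.2f}")
```

The listed counts sum to 160,445, fewer than the 217,560 clips in the full dataset, so presumably not every clip carries a decade label; treat that as an inference rather than a documented fact. The 1960s account for roughly 42% of the decade-labelled clips, so unweighted sampling would lean heavily toward that decade.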

Geographic Coverage

Footage spans 129 countries and 1,435+ cities. Top 15 countries by clip count:

United States: 140,367
Mexico: 41,029
Canada: 3,881
France: 2,977
Italy: 1,658
India: 1,588
Kenya: 1,571
England: 1,527
Russia: 1,405
Denmark: 1,305
Spain: 1,298
Japan: 1,144
South Africa: 1,027
Greece: 969
Israel: 938

Still Images

In addition to video clips, the dataset includes approximately 800,000 still frame extractions. These high-resolution images are derived from key frames across the archive and can be used independently for image classification, object detection, and visual similarity research.

Use Cases

Computer Vision

Train object detection, scene classification, and activity recognition models on authentic mid-century imagery.

Generative AI

Fine-tune video and image generation models to produce realistic vintage aesthetics including film grain, color shifts, and period artifacts.

Temporal Analysis

Study how fashion, architecture, and urban landscapes change over the decades within the same geographic locations.

Multimodal Research

Pair rich textual metadata with visual content for vision-language model training and cross-modal retrieval.

Cultural Preservation

Develop AI tools for automated restoration, colorization, and cataloging of historical film archives.
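
As one illustration of the multimodal use case above, the following Python sketch turns metadata records into (image, caption) pairs of the kind commonly used for vision-language training. The dataset does provide keywords and descriptions, but the JSON layout, the key names (`frame_path`, `description`, `keywords`), and the caption template are hypothetical and would need to be adapted to the delivered schema.

```python
import json

# Hypothetical metadata records; the real schema and key names may differ.
SAMPLE_JSON = """
[
  {"frame_path": "frames/000001.jpg",
   "description": "Street market with vendors and pedestrians",
   "keywords": ["market", "street", "crowd"]},
  {"frame_path": "frames/000002.jpg",
   "description": "",
   "keywords": ["beach", "family"]}
]
"""


def build_caption(record: dict) -> str:
    """Combine the description and keywords into a single caption string."""
    description = record.get("description", "").strip()
    keywords = ", ".join(record.get("keywords", []))
    if description and keywords:
        return f"{description}. Keywords: {keywords}"
    # Defensive fallback: use whichever of the two fields is non-empty.
    return description or keywords


def image_caption_pairs(records: list[dict]) -> list[tuple[str, str]]:
    """Return (image path, caption) pairs, skipping records with no text."""
    pairs = []
    for record in records:
        caption = build_caption(record)
        if caption:
            pairs.append((record["frame_path"], caption))
    return pairs


pairs = image_caption_pairs(json.loads(SAMPLE_JSON))
for path, caption in pairs:
    print(path, "->", caption)
```

Keeping caption construction in its own function makes it easy to swap in a different template, for example one that appends decade or location once those field names are known.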

Case Study: Paris, France 1947

[Image: AI-generated image from a model trained on 1947 Paris, France archival footage]

To test the dataset's potential for generative AI, we trained a model on a narrow slice of the collection: Paris, France, circa 1947, using approximately 2,500 still images extracted from the archive. The generated images were difficult to distinguish from authentic archival footage, reproducing its film grain, color palette, and period atmosphere.

Browse the source collection at 1947 Paris France to see the original training material. The trained model is open-sourced and available for download at CivitAI, so you can try it yourself and see the results firsthand.

Frequently Asked Questions

What formats are available?

Video clips are available as MP4 files. Metadata can be delivered as CSV or JSON files, or via API access. Still frame extractions are available as high-resolution JPEG or PNG files.

Can I license a subset of the dataset?

Yes. We offer custom subsets filtered by decade, geography, format, or keyword. Contact us to discuss your specific requirements.

What license types do you offer?

We offer both research/academic and commercial licenses. Research licenses cover non-commercial AI training and academic publications. Commercial licenses cover production use in AI products and services.

Is the metadata machine-readable?

All metadata fields are structured and machine-readable. Keywords are tokenized, locations are normalized, and temporal data follows consistent formatting for easy integration with ML pipelines.

What is the resolution of the clips?

Clips are digitized from original 8mm and Super 8 film stock. Resolution varies by source but typically ranges from 480p to 720p, preserving authentic film grain and period characteristics.
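
The answers above mention CSV or JSON metadata delivery and subsets filtered by decade, geography, format, or keyword. The following Python sketch shows how such a filter might look on a CSV export. The column names (`clip_id`, `decade`, `country`, `format`, `keywords`), the `;` keyword separator, and the sample rows are all hypothetical; the actual export layout should be confirmed before relying on any of them.

```python
import csv
import io

# Hypothetical CSV export; column names and the ";" separator are assumptions.
SAMPLE_CSV = """clip_id,decade,country,format,keywords
c001,1940s,France,8mm,paris;street;cafe
c002,1960s,United States,Super 8,beach;family
c003,1940s,France,8mm,harbor;boats
c004,1950s,Mexico,8mm,market;street
"""


def load_records(text: str) -> list[dict]:
    """Parse CSV text into dicts, splitting the keyword column into a set."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        row["keywords"] = set(row["keywords"].split(";"))
        records.append(row)
    return records


def select(records, decade=None, country=None, film_format=None, keyword=None):
    """Keep the records that match every criterion that is not None."""
    result = []
    for record in records:
        if decade is not None and record["decade"] != decade:
            continue
        if country is not None and record["country"] != country:
            continue
        if film_format is not None and record["format"] != film_format:
            continue
        if keyword is not None and keyword not in record["keywords"]:
            continue
        result.append(record)
    return result


records = load_records(SAMPLE_CSV)
subset = select(records, decade="1940s", country="France", keyword="street")
print([record["clip_id"] for record in subset])
```

Because every criterion is optional, the same function covers single-field subsets (for example, all Super 8 clips) and combined ones such as the 1940s France query shown here.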

License This Dataset

Interested in licensing the Stockfilm dataset for AI training, academic research, or commercial applications? Contact us to discuss pricing, delivery formats, and custom subsets.
