28/04/2026
Getting started with a project in the domain of your choice has never been easier. Ever thought, "I want to work on a project in banking, but I don't know where to find the data"? Here are some top sources:
HuggingFace Datasets lets you search by modality, size (rows), and file format (select soundfolder if you want!), and you can find traces and benchmarks for GenAI work too.
Kaggle similarly has tons of data and lets you search by file size, file format, usability rating, application, and licence.
Google Dataset Search lets you filter by format, topic, usage rights, and paid vs. free, and lets you restrict results to specific sites too.
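If you want to jump straight to search results for your topic on all three sites, here is a minimal sketch that builds the links for you. The URL patterns and query parameter names below are assumptions based on how these sites' search pages look at the time of writing, not documented APIs, so they may change. The helper name `build_search_links` is made up for this example.

```python
from urllib.parse import urlencode

# Assumed search-page URLs and query parameter names (not official APIs;
# these reflect how the sites' search pages appear to work and may change).
SEARCH_PAGES = {
    "HuggingFace Datasets": ("https://huggingface.co/datasets", "search"),
    "Kaggle": ("https://www.kaggle.com/datasets", "search"),
    "Google Dataset Search": (
        "https://datasetsearch.research.google.com/search",
        "query",
    ),
}


def build_search_links(topic: str) -> dict[str, str]:
    """Return one search link per site for the given topic (hypothetical helper)."""
    links = {}
    for site, (base_url, param) in SEARCH_PAGES.items():
        # urlencode escapes spaces and special characters in the topic
        links[site] = f"{base_url}?{urlencode({param: topic})}"
    return links


if __name__ == "__main__":
    for site, link in build_search_links("banking").items():
        print(f"{site}: {link}")
```

Open the printed links in your browser, then narrow things down with each site's own filters (format, size, licence, and so on).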
These aren't the only ones (there's also the UCI ML Repository and many other platforms), but you'll likely find everything you need here. Just get started! And share this with your friends who are still looking for data. Happy learning and building!
01/03/2026