by Dr. Fei-Fei Li
Instead of refining the processing engine (the model/algorithm), this methodology shifts the entire focus to the fuel (the data). It involves identifying a 'North Star' problem (e.g., Object Recognition) and hypothesizing that the solution lies in the scale and granularity of the input data rather than the complexity of the processing logic.
Core Principles
1. Identify the North Star Problem: Choose a fundamental capability that is currently broken or primitive (e.g., 'Computers cannot see objects').
2. Hypothesize the Missing Ingredient: If current models fail, assume the deficit is experiential/data-based, not just logic-based.
3. Aggressive Data Scaling: Move from thousands of examples to millions (ImageNet grew to 15M images). Scale is a quality all its own.
"It dawned on me that human learning as well as evolution is actually a big data learning process... I think my students and I conjectured that a very critically-overlooked ingredient of bringing AI to life is big data."