Pelayo Arbués


Data Science Fundamentals

Last updated Mar 6, 2023

Data Science Fundamentals is a collection of selected resources aimed at providing a solid background for aspiring and junior Data Scientists. The objective is to create a list as small and powerful as possible. There are thousands of sources out there and it is sometimes difficult to focus on what’s important.

I am afraid to say that there is no shortcut to becoming a professional Data Scientist. I don’t believe in one-month bootcamps and I find that most MS programs miss some key topics. They are usually too focused on technology and ML/DL algorithms, and often forget about other important things such as communicating results or providing a broad picture of the role of a data scientist in a project.

My main motivation is to create a comprehensive list of resources that will allow future Data Scientists to gain deep knowledge of a few core competencies on which they can build their careers. If you already have a strong background in any of these competencies, you may still find something useful on the list. The list is a living document: I want to keep it short, so I may replace a course if I find something better. The competencies themselves should remain mostly the same.

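The core competencies covered are Maths, Communication, Data Science Workflow, Tools of the Trade, Business Understanding, and Ethics.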

# Resources

There are two levels: Foundation and Recommended.

| Competency | Resource | Level |
| --- | --- | --- |
| Maths | Linear Algebra | Foundation |
| Maths | Statistical Learning | Foundation |
| Maths | Statistics 101 Probability | Foundation |
| Maths | Numerical Optimization | Recommended |
| Maths | Time Series Analysis | Recommended |
| Maths | Machine Learning 101 | Recommended |
| Communication | Communicate with impact | Foundation |
| Communication | Technical Writing | Foundation |
| Communication | Data Visualization | Foundation |
| Data Science Workflow | Good Data Analysis | Foundation |
| Data Science Workflow | The Data Science Process | Foundation |
| Data Science Workflow | A B Testing | Recommended |
| Data Science Workflow | Causal Inference | Recommended |
| Data Science Workflow | The Ultimate Guide to Deploying ML Models | Recommended |
| Data Science Workflow | Rules of ML | Recommended |
| Tools of the Trade | SQL | Foundation |
| Tools of the Trade | Programming Language | Foundation |
| Tools of the Trade | Shell Script and others | Foundation |
| Tools of the Trade | Git | Foundation |
| Tools of the Trade | Introduction to Computer Science | Recommended |
| Business Understanding | Oh Oh | Foundation |
| Ethics | Data Science Ethics | Foundation |

📫 If you have any suggestions, do not hesitate to contact me on Twitter at @pelayoarbues.


I prefer to learn by reading books, but I have tried to include video materials where good alternatives exist.

You may notice that some popular techniques, such as Deep Learning and NLP, are missing from this list. In my experience, these tools are not essential in a typical project at a typical company in a typical industry. Besides being niche methods, they are unlikely to be handed to a newcomer: more senior Data Scientists usually spend their time on unglamorous work and are eager to land one of these cool projects themselves.