This course will help participants:
- Understand key concepts in data science and their real-world applications.
- Explain how data is collected, managed and stored for data science.
- Plan and generate visualizations from data using Python and Bokeh.
- Work effectively with live data and utilize the opportunities presented by cloud services.
The course is broken into five days.
In day 1, you will get hands-on experience of Jupyter, the web-based environment that you will use for the course exercises and assignments. This day also contains a Python Primer activity for those who are unfamiliar with the programming language or would like a refresher.
In day 2, you will learn about the fundamental terminology and processes in data science, discovering the technology landscape that has helped fuel the data explosion, and the tools that data scientists use to unlock the hidden value in these vast amounts of data. This day also contains an introduction to using Python for data science.
You will begin gaining hands-on experience of data science in day 3, focusing on collecting, storing and managing data. You will also learn about the different sources of data and how they can be combined to increase the potential insights available.
Day 4 will then help you understand how this data is analyzed, covering a range of techniques that a data science team would typically use, from statistics to machine learning. You will use Python to apply these analytical techniques to a real-world dataset.
In day 5, you will learn how the findings from data science work can be reported using different data visualization techniques. You will discover the various ways in which particular types of data can be displayed to highlight a key finding and improve the impact of your reports.
Hands-on experience and assignments
Each day contains a mix of taught material, self-study material, activities and practical online exercises.
Day 1 includes an optional introduction to (or refresher on) Python, with online exercises for you to work through at your own pace. This is not graded. Day 2 contains further Python practice exercises. You are encouraged to do these, as they will help you with your assignments. Again, these practice exercises are not graded.
Days 3, 4 and 5 each include online exercises (ungraded) and a related graded coursework assignment.
Aims and learning outcomes
This course will provide you with the knowledge and expertise to become a proficient data scientist.
Having successfully completed this module, you will be able to:
- Understand the key concepts in data science, including their real-world applications and the toolkit used by data scientists;
- Explain how data is collected, managed and stored for data science;
- Implement data collection and management scripts using MongoDB;
- Demonstrate an understanding of statistics and machine learning concepts that are vital for data science;
- Produce Python code to statistically analyze a dataset;
- Critically evaluate data visualizations based on their design and use for communicating stories from data;
- Plan and generate visualizations from data.