This is our catalog of courses. We will occasionally adjust the course listing to reflect the addition of new courses and the retirement of others.
Total classes: 15
Prerequisite: An understanding of algebra (polynomial equations, algebraic reasoning, and problem-solving) is recommended.
An understanding of matrix mathematics and statistics is helpful but NOT required; both will be discussed in the lectures.
Previous computer programming experience (Python preferred, but other programming languages are acceptable). Computer Programming 101 or Introduction to Computer Science, both available as recorded courses through Unlimited Access, would provide sufficient prerequisite experience. Much of the analysis will take place using Python-based computer programs.
General familiarity with computers, including the ability to open applications, use menu-driven commands, and type on a keyboard, so that the lessons can focus on the programming assignments and related data-science topics.
Suggested grade level: 9th to 12th
Suggested credit: One full semester Computer Science or Math
This course is the first in a two-semester exploration of many topics associated with data science. Data is available almost everywhere: in many industries (agriculture, medical fields, cyber-security, manufacturing, and more) and in organizations ranging from small family businesses to big-data corporations like Google. Data scientists work with that data to find correlations, visualize it in a variety of charts and plots, identify outliers that depart from the larger dataset or from its trends, and predict future outcomes based upon variable inputs. These are just some of the ways that data helps people draw valuable insights from otherwise chaotic and disconnected pieces of information.
Because data science can be applied in so many working environments, studying it is no longer limited to those interested in a career in Information Technology (IT). Data science is becoming one of the fastest-growing professional careers because it can find a “home” in so many industries.
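As a rough illustration of the kinds of analysis described above (correlation, outlier detection, and prediction), here is a minimal sketch in plain Python. It is not taken from the course materials: the function names, the sample numbers, and the two-standard-deviation outlier threshold are all assumptions made for illustration, and the course itself may approach these tasks differently (for example, with Pandas or NumPy).

```python
from __future__ import division  # keeps "/" as true division on Python 2.7 too
import math


def mean(values):
    """Arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def find_outliers(values, threshold=2.0):
    """Values more than `threshold` sample standard deviations from the mean.

    The default threshold of 2.0 is an illustrative assumption.
    """
    center = mean(values)
    variance = sum((v - center) ** 2 for v in values) / (len(values) - 1)
    stdev = math.sqrt(variance)
    return [v for v in values if abs(v - center) > threshold * stdev]


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mean_x, mean_y = mean(xs), mean(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


# Hypothetical data, invented for this example.
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]
daily_visitors = [100, 102, 98, 101, 99, 103, 97, 500]

correlation = pearson_correlation(ad_spend, sales)  # close to 1.0 here
outliers = find_outliers(daily_visitors)            # flags the 500 spike
slope, intercept = fit_line(ad_spend, sales)
predicted_sales = slope * 60 + intercept            # prediction for a new input
```

The sample data is deliberately tidy so the relationships are easy to see; real datasets are rarely this clean.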
Topics are subject to minor changes. They will be interspersed throughout the lectures and will span multiple weeks.
What is data science?
Who uses data science?
Workflows and methodologies used by data scientists
Python programming for data science
The development environment (Anaconda, Jupyter Notebooks, and Spyder)
Review of Python programming fundamentals and Python data types (variables, lists, dictionaries, etc.)
Python functions and some of the Python modules we will be using (Pandas, NumPy, scikit-learn, and more)
Exploring data sets of various types (sales data, website visitor logs, user profile data, etc.)
Cleaning “dirty” datasets
Review of (or introduction to) statistical math methods
Data visualization in Python and spreadsheet applications
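To give a sense of what a topic such as cleaning “dirty” datasets can involve, here is a small sketch in plain Python. It is an illustration only, not an excerpt from the lectures: the record layout, the field names `region` and `amount`, and the cleaning rules are hypothetical, and the course lists Pandas among its modules, which may be the tool actually used for this kind of work.

```python
def clean_sales_records(records):
    """Return a cleaned copy of `records`, a list of dicts.

    Steps: trim whitespace, normalize region capitalization, strip "$" and
    "," from amounts, drop rows whose amount is missing or not numeric, and
    drop rows that repeat an earlier (region, amount) pair.
    """
    cleaned = []
    seen = set()
    for record in records:
        region = record.get("region", "").strip().title()
        raw_amount = str(record.get("amount", "")).strip()
        raw_amount = raw_amount.replace("$", "").replace(",", "")
        try:
            amount = float(raw_amount)
        except ValueError:
            continue  # missing or non-numeric amount: skip the row
        key = (region, amount)
        # Treating equal (region, amount) pairs as duplicates is a
        # simplification chosen for this example.
        if not region or key in seen:
            continue
        seen.add(key)
        cleaned.append({"region": region, "amount": amount})
    return cleaned


# Hypothetical "dirty" data, invented for this example.
dirty = [
    {"region": " north ", "amount": "$1,200.50"},
    {"region": "North", "amount": "1200.50"},  # duplicate once cleaned
    {"region": "SOUTH", "amount": "n/a"},      # non-numeric amount
    {"region": "east", "amount": ""},          # missing amount
    {"region": "West", "amount": "300"},
]

cleaned = clean_sales_records(dirty)
```

In this sample, only the first “North” row and the “West” row survive cleaning; the other three are dropped as a duplicate, a non-numeric amount, and a missing amount.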
Course Materials: All course materials will be provided by the professor. Software to be installed: the Python 2.7 version of Anaconda (https://www.anaconda.com), NOT the Python 3.x version. Anaconda is available for Windows, Mac, and Linux operating systems. Within Anaconda, ensure that the Jupyter Notebook and Spyder add-in applications are installed. The open-source Anaconda Distribution is the easiest way to do Python data science and machine learning.
Homework: Computer-generated quizzes, at-home analytical exercises, and exploration of methodologies applied to items of personal interest. Spreadsheet applications like Microsoft Excel and/or OpenOffice (https://www.openoffice.org) may also be used. Students can expect 2-6 hours of study outside of class, depending upon their proficiency with programming in Python and their previous familiarity with algebra, matrix mathematics, and statistics. If some of the math is new, students will naturally need to spend time learning it before they can program it effectively.