DScience:Introduction to data science

From HandWiki


Introduction to data science

Data science is an interdisciplinary field that uses scientific methods, algorithms and computer programs to extract knowledge from data in various forms. Data science professionals understand that they must go beyond the traditional skills of analyzing large amounts of data, data mining, and programming.

This course introduces the main ideas and algorithms used by data scientists. It gives an overview of statistics and shows how to deal with large data sets using various algorithms and their implementations. More importantly, it gives a large number of code examples that help in understanding data science methods.

This course is designed with two facts in mind: the popularity of Java, which has ranked at or near the top of the TIOBE popularity index, and the popularity of Python, which is the most popular scripting language used in data science. Therefore, we will use the DataMelt program, which combines these two programming languages in a single framework.

To get the most out of this course, it is advisable to know the basics of Java and Python. Some tutorials are provided by jWork.ORG. In particular, it is useful to read the interactive Python tutorial before starting this course.

This tutorial is provided under this license agreement.