Special Topics in Data Analytics and Modeling
National Taiwan University
Data is at the center of the so-called fourth paradigm of scientific research, which is expected to spawn new sciences useful to society. Data is also a new and extremely strong driving force behind many present-day applications, such as smart cities, manufacturing informatics, and societal security, to name a few. It is thus imperative that our students know how to handle data, analyze data, use data, and draw insights from data. This course aims to acquaint students with the analytical foundations of data handling techniques. The course consists of a series of seminar talks with substantial student participation, in the form of research and presentations in response to posted questions about the main topics in data analytics and modeling.

1. Scope

Broad topics covered in the course include:
• Regression & curve fitting
• Probability distribution & parameter estimation
• Mixture models, latent variable models & hybrid distributions
• Hidden Markov models, Markov random fields & graphical models
• Pattern recognition & decision theory
• Neural networks and deep learning

We'll spend 2-3 weeks on each topic (some may take up to 4 weeks).

2. Format

For each topic, a number of questions to help students learn the subject will be posted in advance. Individual students will be assigned to conduct research, answer specific questions, and return with presentations to the class. Each student presentation lasts ~20 min, followed by ~10 min of questions and discussion. Students who are assigned to address specific questions have one week to prepare their presentations.

Common questions shared by all topics are:
- What are the problems that gave rise to the particular topic & concept? (The original motivation)
- What problems beyond the original motivation will the topic and the related techniques be able to solve? (New and novel applications)
- What are the problem formulations with relevant assumptions that have been proposed? (The methodology and formulation)
- What is the ensemble of techniques that were developed to solve the problem? (The tools and capabilities)
- How do these techniques solve the problem or contribute to the solutions? (The solution mechanism)
- What are the limitations of the solutions proposed so far? Any remaining open problems in the topic? (Research opportunities)

In addition to these common questions, some topic-specific questions may also be posted and addressed in student presentations. After all posted questions about a subject have been addressed in student presentations, one or two commentary sessions by the lecturer on the subject will follow, so as to complete the systematic development of understanding of the subject.

The course will be conducted primarily in English. To reflect the applicability of the subject matter to local problems, local languages may also be used as circumstances call for.

No official textbook is assigned in this course. Students are expected to conduct research with all university-provided resources (e.g., books in the library) and information available on the web. Class notes by the lecturer will be distributed in due course.

3. Prerequisite

Both graduate and undergraduate students can enroll in the class, as long as they have completed engineering mathematics courses, particularly Probability and Statistics or the equivalent.
Overall, students will be exposed to data analytics topics and their historical perspectives, learn to pose and analyze related problems, understand the modeling techniques and their origins, and conceive of new applications and research opportunities.
No written test will be given in this special topics course. Student presentations are evaluated by the class and moderated by the lecturer.
Online Course Requirement