The Certification of Professional Achievement in Data Sciences prepares students to expand their career prospects or change career paths by developing foundational data science skills. Join us from anywhere in the world as the program is now also offered online.
This program is offered jointly by the Graduate School of Arts & Science's Department of Statistics and The Fu Foundation School of Engineering & Applied Science's Department of Computer Science and Department of Industrial Engineering & Operations Research.
- Undergraduate degree
- Prior quantitative coursework (calculus, linear algebra, etc.)
- Prior introductory computer programming coursework
- Official transcript copies from every post-secondary institution attended
- Three recommendation letters
- Personal statement
- Curriculum vitae / resumé
- Non-refundable application fee ($85.00 for on-campus program, $150 for online program)
We routinely offer online information sessions and other recruiting events; for details, please [Click Here]. To learn more about the admissions application requirements, please visit the Office of Graduate Student Affairs.
The priority deadline for Fall application submissions is February 15th.
Online Program: [Apply Here]
TUITION AND FEES
Students enrolled in the Certification of Professional Achievement program pay Columbia Engineering's rate of tuition. Tuition and fees are prescribed by statute and are subject to change at the discretion of the Trustees. For more information on rates of tuition and other applicable fees, refer to Student Financial Services and the Columbia Engineering Bulletin. Note that the online Certification program has an additional non-refundable technology fee of $395 per course.
Candidates for the Certification of Professional Achievement in Data Sciences, a non-degree part-time program, are required to complete a minimum of 12 credits, including the four required courses described below.
For the most up-to-date course offerings and schedule information, refer to COURSES.
The required Certification courses may be eligible for advanced standing towards the Master of Science in Data Science program upon admission to that program. Since Columbia University's policy prohibits the double counting of coursework between programs, Certification students admitted to and enrolled in the Master of Science program will forgo their Certification in order to allow these courses to count towards their Master of Science.
CSOR W4246 ALGORITHMS FOR DATA SCIENCE
Prerequisites: Basic knowledge of programming (e.g., at the level of COMS W1007) and a basic grounding in calculus and linear algebra.
Methods for organizing data, e.g., hashing, trees, queues, lists, priority queues. Streaming algorithms for computing statistics on the data. Sorting and searching. Basic graph models and algorithms for searching, shortest paths, and matching. Dynamic programming. Linear and convex programming. Floating point arithmetic, stability of numerical algorithms. Eigenvalues, singular values, PCA, gradient descent, stochastic gradient descent, and block coordinate descent. Conjugate gradient, Newton and quasi-Newton methods. Large-scale applications from signal processing, collaborative filtering, recommendation systems, etc.
STAT GR5701 PROBABILITY AND STATISTICS FOR DATA SCIENCE
This course covers the fundamentals of probability theory and statistical inference used in data science. Probabilistic models: random variables, useful distributions, expectations, law of large numbers, central limit theorem. Statistical inference: point and confidence interval estimation, hypothesis tests, linear regression.
COMS W4721 MACHINE LEARNING FOR DATA SCIENCE
Prerequisites: Background in linear algebra and probability and statistics.
An introduction to machine learning, with an emphasis on data science. Topics will include least squares methods, Gaussian distributions, linear classification, linear regression, maximum likelihood, exponential family distributions, Bayesian networks, Bayesian inference, mixture models, the EM algorithm, graphical models, hidden Markov models, support vector machines, and kernel methods. Part of the course will focus on methods and problems relevant to big data.
STAT GR5702 EXPLORATORY DATA ANALYSIS AND VISUALIZATION
Fundamentals of data visualization, layered grammar of graphics, perception of discrete and continuous variables, introduction to Mondrian, mosaic plots, parallel coordinate plots, introduction to GGobi, linked plots, brushing, dynamic graphics, model visualization, clustering and classification.