Nic Hug, Associate Research Scientist, Enhances Scikit-Learn
In Andreas Mueller, the Data Science Institute (DSI) has one of the nation’s most prominent scikit-learn core developers. And now that two of his team members have been named core developers, DSI has become the go-to place for questions on how to use scikit-learn, the immensely popular open-source machine learning library.
Joining Mueller as core developers for scikit-learn are DSI Postdoctoral Researcher Nicolas Hug and Thomas Fan, a software developer. Both Hug and Fan were nominated by existing scikit-learn core developers and were recently voted in, which gives them voting rights on the project's design decisions.
“DSI now has one of the biggest and best teams of scikit-learn core developers in the world,” says Mueller, an associate research scientist at DSI and author of the book Introduction to Machine Learning with Python. “They have both contributed to enhancing the library and making it more user-friendly, so that researchers from all fields, not just technical fields, can use and benefit from scikit-learn.”
Working with Mueller to develop scikit-learn, Hug and Fan have excelled at maintaining various important aspects of the library. Core developers review code contributions, merge approved pull requests, and guide the development of the library by weighing in on major changes to the application programming interface (API).
In his research, Hug focuses on integrating automated machine learning tools into scikit-learn. Using gradient-boosted trees, he has made a family of algorithms run much faster, cutting their running time from around five minutes to about five seconds. The algorithms can now be trained much faster and return predictions more quickly. He also helps users who have questions or need guidance, which often means reviewing code and finding bugs. His work improves the library and helps build a larger community of users.
he worked to build the documentation at scikit-learn.org, a manual for using scikit-learn, and enhanced the caching and downloading speeds for