Speaker: Ms. Shahana Ibrahim
From: Oregon State University
Abstract
With the rise of the AI revolution, demand for data labeling is exceptionally high. However, most data labeling is done through crowdsourcing, where labels are often provided by non-expert annotators. The resulting labels are considerably noisy, which severely degrades the performance of AI systems trained on such data. This talk introduces a range of performance-guaranteed approaches for learning efficiently from noisy crowdsourced labels. The talk begins by revisiting the Dawid-Skene model, one of the most influential models in crowdsourcing, and demonstrates how to advance the learning of this model, addressing critical aspects such as model identifiability, scalability, sample efficiency, and provable algorithms. The talk also covers how to leverage end-to-end deep learning techniques to make learning systems more robust in adverse crowdsourcing scenarios, including those with dependent annotators, no expert annotators, or annotators whose confusions are data-dependent. In essence, the talk explores the synergy between classical factorization tools and deep learning methods in this domain.
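For readers unfamiliar with the Dawid-Skene model named in the abstract, the sketch below may help. It assumes the textbook formulation: each item has a hidden true class, and each annotator has their own confusion matrix giving the probability of reporting each label given the true class. The code fits this model with the classical EM procedure, which is not necessarily any of the methods presented in the talk. The function name `dawid_skene_em`, the `-1` convention for missing labels, the smoothing constant, and the simulated data are all illustrative assumptions.

```python
import numpy as np


def dawid_skene_em(labels, num_classes, num_iters=50, smoothing=1e-2):
    """Classical EM for the Dawid-Skene model (illustrative sketch).

    labels: int array of shape (num_items, num_annotators); -1 marks a
            missing label (assumed convention for this example).
    Returns (posteriors, confusion, prior).
    """
    num_items, num_annotators = labels.shape

    # One-hot encode the observed responses; missing entries stay all-zero
    # and therefore contribute nothing to either EM step.
    onehot = np.zeros((num_items, num_annotators, num_classes))
    rows, cols = np.nonzero(labels >= 0)
    onehot[rows, cols, labels[rows, cols]] = 1.0

    # Initialize the posteriors over true classes with a soft majority vote.
    post = onehot.sum(axis=1) + smoothing
    post /= post.sum(axis=1, keepdims=True)

    for _ in range(num_iters):
        # M-step: class prior and per-annotator confusion matrices.
        prior = post.sum(axis=0) + smoothing
        prior /= prior.sum()
        # confusion[m, k, l] ~ P(annotator m reports l | true class is k)
        confusion = np.einsum("nk,nml->mkl", post, onehot) + smoothing
        confusion /= confusion.sum(axis=2, keepdims=True)

        # E-step: posterior over the true class of each item (log domain).
        log_post = np.log(prior)[None, :] + np.einsum(
            "nml,mkl->nk", onehot, np.log(confusion)
        )
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)

    return post, confusion, prior


# Simulated crowd (made-up numbers): 300 items, 5 annotators, 3 classes.
rng = np.random.default_rng(0)
true_classes = rng.integers(0, 3, size=300)
noisy = np.where(
    rng.random((300, 5)) < 0.75,          # keep the true label most of the time
    true_classes[:, None],
    rng.integers(0, 3, size=(300, 5)),    # otherwise report a random label
)
noisy[rng.random((300, 5)) < 0.2] = -1    # drop roughly 20% of the responses

posteriors, confusion, prior = dawid_skene_em(noisy, num_classes=3)
accuracy = (posteriors.argmax(axis=1) == true_classes).mean()
print(f"EM label accuracy on simulated data: {accuracy:.3f}")
```

The majority-vote initialization keeps the estimated classes aligned with the label names in this simple setting. EM offers no general guarantee of recovering the true confusion matrices, which is the kind of gap that identifiability and provable-algorithm results aim to close.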