School of Electrical and Computer Engineering

Full name: Itamar Elmakias
Degree program: PhD
Supervisor: Dr. Dan Vilenchik
Seminar title: Choosing the Right Feature Selection Algorithm: Dataset Hardness, Algorithm Cost, and Practical Guidelines

Seminar abstract: Feature Selection (FS) is a central component in modern machine learning pipelines, particularly for high-dimensional classification tasks. While a large body of research proposes new FS algorithms, practitioners still lack clear guidance on when FS is truly beneficial, which algorithms to use, and how to balance performance gains against computational cost and stability.

In the first part of this seminar, I will present findings from our recent work on dataset hardness characterization. Using a large-scale empirical study across many real-world datasets, we show that the effectiveness of FS strongly depends on intrinsic dataset properties, and that the common assumption that FS is universally beneficial does not hold. We introduce a practical taxonomy of datasets based on their response to FS, providing an empirical lens for understanding when FS is likely to help.

In the second part, I will present ongoing work that shifts the focus from datasets to algorithms. We analyze FS algorithms through multiple operational dimensions, including runtime, stability across cross-validation folds, average and maximum performance gains, and sensitivity to the number of selected features. Based on these analyses, we propose a cost-aware, bucketed framework for FS algorithm selection, offering actionable guidelines for choosing an FS method under different time budgets and dataset regimes.

Overall, the seminar aims to bridge the gap between theoretical FS research and practical decision-making, providing evidence-based heuristics for selecting FS algorithms rather than treating them as black-box components.
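
As a rough illustration of one dimension named in the abstract, stability across cross-validation folds, the sketch below scores how consistently a method picks the same features from fold to fold. It uses the mean pairwise Jaccard similarity, a common stability measure that is assumed here for illustration and is not necessarily the one used in this work. The function names and the fold data are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two feature sets (1.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def selection_stability(selected_per_fold):
    """Mean pairwise Jaccard similarity over the per-fold feature sets.

    1.0 means every fold selected the same features; values near 0
    mean the selections barely overlap.
    """
    pairs = list(combinations(selected_per_fold, 2))
    if not pairs:
        raise ValueError("need feature sets from at least two folds")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical feature indices selected in three folds (illustrative only).
folds = [{0, 1, 2, 3}, {0, 1, 2, 4}, {0, 1, 5, 6}]
stability = selection_stability(folds)
print(f"stability = {stability:.3f}")
```

A score like this could be reported alongside runtime and accuracy gain for each FS method, so that an unstable but fast method is not mistaken for a reliable one.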
25 May 2026