I’m a Data Scientist and a Researcher. I currently work as Senior Manager of Data & Machine Learning Science at Expedia Group, a leading online travel company. I also hold an Honorary Senior Research Fellow position at University College London (UCL). Previously, I was a Visiting Researcher at the University of Oxford, a Teaching Assistant at the University of Portsmouth, and a co-founding member of the Geoinformatics and Earth Observation Laboratory at Penn State University. I hold a PhD in Computational Intelligence and a BSc (Hons) in Computer Science, and I have more than 15 peer-reviewed scientific publications in journals and international conferences. I am interested in Machine Learning (ML), with a particular focus on Recommendation Systems and Optimization. I specialize in building complex Distributed ML Frameworks & Pipelines, solving business needs at scale through custom ML solutions (from prototypes to fully fledged production-ready models). I also have experience in Learning-to-Rank, Deep Learning and Time Series Forecasting.
Dr Alessio Petrozziello
London, UK.
-----------------------
alessio92p@gmail.com
apetrozziello@expediagroup.com
a.petrozziello@ucl.ac.uk
Machine Learning for Big Data - Machine learning is well suited to exploiting the opportunities hidden in big data. It extracts value from large and disparate data sources with far less reliance on human direction: it is data driven and runs at machine scale, so it copes with the complexity of heterogeneous sources, many variables and large data volumes. Unlike traditional analysis, machine learning benefits from growing datasets: the more data fed into a machine learning system, the more it can learn, and the higher the quality of the resulting insights. Freed from the limits of human-scale analysis, machine learning can discover and surface the patterns buried in the data.
Missing Data Imputation - Dealing with missing data is an important step in dataset pre-processing, since most statistical analysis techniques, data reduction tools, machine learning methods and recommender systems require complete datasets. Many techniques can be used to deal with missingness; the common goal is to make the most of the available data by minimizing both the loss of statistical power and the bias inevitably introduced by inferring values for the missing data. A more interesting problem arises when the data grows so large that state-of-the-art techniques are no longer applicable because of their computational time and memory requirements. Here, new distributed and online techniques need to be studied and developed.
Software Development Effort Estimation - Development effort is considered the dominant cost of software projects, so effort estimation is a critical activity for planning and monitoring software project development and for delivering the product on time and within budget. Significant over- or under-estimates can be very expensive, and the competitiveness of a software company depends heavily on the ability of its project managers to accurately predict in advance the effort required to develop software systems. Several methods have been proposed in the literature to estimate software development effort. Among them, widely employed estimation methods, e.g., Linear and Stepwise Regression, Regression Trees, and Case-Based Reasoning, try to explain the effort to develop a software system in terms of some relevant factors (named cost drivers). These methods exploit data from past projects, consisting of both the values of the factors related to effort and the actual effort spent on those projects, in order to estimate the effort for a new project under development. The main research topics in software development effort estimation concern the definition and empirical evaluation of search-based approaches for building novel estimation models, and the definition and empirical evaluation of functional metrics for sizing software products.
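As a minimal sketch of the data-driven idea described above, the following Python snippet fits a simple linear regression on past projects and uses it to estimate the effort of a new one. It assumes a single cost driver (size in function points), and the project data, function names and units are invented purely for illustration; they do not come from any real dataset or published model.

```python
# Illustrative only: the past-project data below is invented for this sketch.
# Each entry is (size in function points, actual effort in person-hours).
past_projects = [
    (100, 520.0),
    (150, 770.0),
    (200, 1020.0),
    (250, 1270.0),
]


def fit_simple_regression(projects):
    """Ordinary least squares for: effort = intercept + slope * size."""
    n = len(projects)
    mean_x = sum(x for x, _ in projects) / n
    mean_y = sum(y for _, y in projects) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in projects)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in projects)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_effort(model, size):
    """Apply the fitted model to the cost driver of a new project."""
    intercept, slope = model
    return intercept + slope * size


model = fit_simple_regression(past_projects)
estimate = estimate_effort(model, 180)  # hypothetical new project of size 180
print(f"Estimated effort: {estimate:.1f} person-hours")
```

In practice an estimation model would use several cost drivers and be validated on held-out projects; the single-variable form is used here only to keep the mechanics visible.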
PhD in Computational Intelligence: Supervised Machine Learning Methods for Complex Data • University of Portsmouth - 2019
Bachelor Degree in Computer Science: 1st with Honours • University of Salerno - 2014
Improving Applicability of Nature-Inspired Optimisation • Paris - Oct 18-24 2017
International School on Mathematics “Guido Stampacchia” Workshop “Graph Theory, Algorithms and Applications” (3rd Edition) • Erice - Sep 8-16 2014
International Summer School on Software Engineering (11th edition) • University of Salerno - July 2014
Data Mining of satellite data for the study of natural hazards • University of Salerno - September 2013
Senior Manager of Data & Machine Learning Science • Expedia Group - Mar 2021 - Present
Data Science Team Lead, Machine Learning • Expedia Group - Sep 2020 - Mar 2021
Senior Data Scientist, Machine Learning • Expedia Group - Nov 2019 - Sep 2020
Data Scientist, Machine Learning • Expedia Group - Sep 2017 - Nov 2019
Data Scientist Intern, Machine Learning • Expedia Group - Jun 2017 - Sep 2017
Data Scientist Intern, Machine Learning • Expedia Group - Jun 2016 - Oct 2016
Honorary Senior Research Fellow • University College London (UCL) - Nov 2019 - Present
Visiting Researcher at the University of Oxford (Big Data Institute) under the supervision of Dr Antoniya Georgieva. • University of Oxford - Jan 2018 - Dec 2018
Erasmus+ Traineeship at the University of Salerno (DISA-MIS) under the supervision of Dr Roberto Tagliaferri. • University of Salerno - Mar 2017 - Apr 2017
Erasmus+ Traineeship at the University of Portsmouth (School of Computing) under the supervision of Dr Ivan Jordanov. • University of Portsmouth - Mar 2015 - Jul 2015
Research Assistant at Penn State University (Department of GeoInformatics) under the supervision of Dr Guido Cervone. • Penn State University - Mar 2014 - May 2014
A. Petrozziello and C. Sommeregger, "Search Ranking At Hotels.com". (Under Review)
F. Sarro, A. Petrozziello, D.-Q. He and S. Yoo, "A New Approach to Distribute MOEA Pareto Front Computation", in Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO’20 Companion).
A. Petrozziello and I. Jordanov, "Automated Deep Learning for Threat Detection in Luggage from X-ray Images". Special Event on Analysis of Experimental Algorithms (SEA 2019).
A. Petrozziello and I. Jordanov, "Feature Based Multivariate Data Imputation". The 4th Annual Conference on Machine Learning, Optimization and Data Science (LOD 2018).
A. Petrozziello, I. Jordanov, A.T. Papageorghiou, C.W.G. Redman, and A. Georgieva, "Deep Learning for Continuous Electronic Fetal Monitoring in Labor", IEEE 40th International Engineering in Medicine and Biology Conference (EMBC 2018).
A. Petrozziello, C. Sommeregger and I. Jordanov, "Distributed Neural Networks for Missing Big Data Imputation", IEEE International Joint Conference on Neural Networks (IJCNN 2018).
A. Petrozziello and I. Jordanov, "Column-wise Guided Data Imputation", 17th International Conference on Computational Science (ICCS 2017).
A. Petrozziello and I. Jordanov, "Data Analytics for Online Traveling Recommendation System: A Case Study", IASTED's 36th International Conference on Modelling, Identification and Control (MIC 2017).
I. Jordanov, N. Petrov and A. Petrozziello, "Supervised Radar Signal Classification", IEEE International Joint Conference on Neural Networks (IJCNN 2016).
F. Sarro, A. Petrozziello and M. Harman, "Multi-Objective Effort Estimation", ACM 38th International Conference on Software Engineering (ICSE 2016).
A. Petrozziello, A. Serra, L. Troiano, M. La Rocca, G. Storti, I. Jordanov and R. Tagliaferri, "Deep Learning for Volatility Forecasting in Asset Management", Soft Computing (2022).
A. Petrozziello, X. Liu and C. Sommeregger, "A Scale Invariant Ranking Function for Learning-to-Rank: A Real-World Use Case", arXiv preprint (2021).
V. Tawosi, F. Sarro, A. Petrozziello and M. Harman, "Multi-Objective Software Effort Estimation: A Replication Study", IEEE Transactions on Software Engineering (2021).
F. Sarro, R. Moussa, A. Petrozziello and M. Harman, "Learning From Mistakes: Machine Learning Enhanced Human Expert Effort Estimates", IEEE Transactions on Software Engineering (2020).
A. Petrozziello, I. Jordanov, A.T. Papageorghiou, C.W.G. Redman, and A. Georgieva, "Multimodal Convolutional Neural Networks to detect fetal compromise during labor and delivery", IEEE Access (2019).
A. Petrozziello and F. Sarro, "Linear Programming as a Baseline for Software Effort Estimation", ACM Transactions on Software Engineering and Methodology (2018).
I. Jordanov, N. Petrov and A. Petrozziello, "Classifiers accuracy improvement based on missing data imputation", Journal of Artificial Intelligence and Soft Computing Research (2018).
A. Petrozziello, G. Cervone, P. Franzese, S.E. Haupt, R. Cerulli, "Source Reconstruction of Atmospheric Releases with Limited Meteorological Observations Using Genetic Algorithms", Applied Artificial Intelligence Journal (2017).
Erasmus+ Traineeship Scholarship (960€) • University of Portsmouth - April 2017
3 years full time PhD Bursary (£42000) • University of Portsmouth - September 2015
Erasmus+ Traineeship Scholarship (2400€) • University of Salerno - February 2015
Scholarship "Messaggeri della Conoscenza" (12000€) • MIUR (Italian Ministry of Education, University and Research) - October 2013
The COST Action CA15140 'Improving Applicability of Nature-Inspired Optimisation' (925€) • COST Action - October 2019
The COST Action CA15140 'Improving Applicability of Nature-Inspired Optimisation' (1420€) • COST Action - October 2017
AWS Credits for Research Award ($9000) • Amazon Web Services - August 2017
Azure for Research Award: Data Science ($20000) • Microsoft - April 2017
Azure for Research Award: Machine Learning ($20000) • Microsoft - May 2016
Azure for Research Award: Machine Learning ($20000) • Microsoft - May 2015
ISSSE2014 Travel Grant • University College London (UK) - June 2014
Intern Prize Award - Summer Hackathon 2016 on Smart Searches • Hotels.com - September 2016
Bronze Medal - 13th Annual "Humies" Awards for Human-Competitive Results Produced by Genetic and Evolutionary Computation ($2000) • Genetic and Evolutionary Computation Conference (GECCO16) - July 2016
Accenture Talent Digital Competition - Finalist • Accenture - June 2015
The Young Scientist Award ($1000) • The Italian Cultural Society of Washington D.C., Inc. - June 2014
Currently (co)advising PhD Students
Two PhD candidates co-supervised at University College London (UK) with Dr Federica Sarro.
Formerly (co)advised Bachelor and Master Students
One Master thesis co-supervised at University College London (UK) with Dr Federica Sarro.
Two Erasmus students (master thesis) co-supervised at University College London (UK), with Dr Federica Sarro (University College London, UK) and Filomena Ferrucci (University of Salerno, Italy).
Academic Year 2017/2018
First term: Artificial Neural Networks and Genetic Algorithms (NENGA), BSc year 3, Dr Ivan Jordanov, School of Computing, University of Portsmouth, Portsmouth (UK).
First term: Advanced Programming Concepts (ADPROC), BSc year 2, Dr Ivan Jordanov, School of Computing, University of Portsmouth, Portsmouth (UK).
Academic Year 2016/2017
Second term: Artificial Neural Networks and Genetic Algorithms (NENGA), BSc year 3, Dr Ivan Jordanov, School of Computing, University of Portsmouth, Portsmouth (UK).
First term: Advanced Programming Concepts (ADPROC), BSc year 2, Dr Ivan Jordanov, School of Computing, University of Portsmouth, Portsmouth (UK).
Academic Year 2015/2016
Second term: Artificial Neural Networks and Genetic Algorithms (NENGA), BSc year 3, Dr Ivan Jordanov, School of Computing, University of Portsmouth, Portsmouth (UK).
First term: Advanced Programming Concepts (ADPROC), BSc year 2, Dr Ivan Jordanov, School of Computing, University of Portsmouth, Portsmouth (UK).
The Genetic and Evolutionary Computation Conference (GECCO): 2018
Symposium on Search-Based Software Engineering (SSBSE): 2017
International Conference on Software Maintenance and Evolution (ICSME): 2017
Empirical Software Engineering: 2018, 2019
IEEE Transactions on Software Engineering: 2017
IET Software: 2017
Information and Software Technology: 2017
Journal of Hazardous Materials: 2017
SPARK+AI Summit Europe, October 2019
SPARK+AI Summit Europe, October 2018
AI Congress and Data Science Summit, September 2018
IEEE International Joint Conference on Neural Networks (IJCNN 2018), July 2018
17th International Conference on Computational Science (ICCS 2017), June 2017
Member of the Software Optimisation, Learning and Analytics Research (SOLAR) group • University College London (UCL) - March 2020 - Present
Visiting Researcher at the Big Data Institute (BDI) • University of Oxford - January 2018 - December 2018
Collaborator at the Dynamic Adaptive Automated Software Engineering (DAASE) - CREST Centre • University College London (UCL) - October 2017 - March 2020
Researcher • University of Portsmouth - October 2015 - June 2019
Co-Founding member of the “Geoinformatics and Earth Observation Laboratory” • Penn State University - February 2014