About Me

I’m a Data Scientist and Researcher. I currently work as Senior Manager of Data & Machine Learning Science at Expedia Group, a leading online travel company. I also hold an Honorary Senior Research Fellow position at University College London (UCL). Previously, I was a Visiting Researcher at the University of Oxford, a Teaching Assistant at the University of Portsmouth, and a co-founding member of the Geoinformatics and Earth Observation Laboratory at Penn State University. I hold a PhD in Computational Intelligence and a BSc (Hons) in Computer Science, and I have more than 15 peer-reviewed scientific publications in journals and international conferences. I am interested in Machine Learning (ML), with a particular focus on Recommendation Systems and Optimization. I specialize in building complex Distributed ML Frameworks & Pipelines, solving business needs at scale through custom ML solutions (from prototypes to fully fledged production-ready models). I also have experience in Learning-to-Rank, Deep Learning and Time Series Forecasting.

Contact Details

Dr Alessio Petrozziello
London, UK.
-----------------------
alessio92p@gmail.com
apetrozziello@expediagroup.com
a.petrozziello@ucl.ac.uk

Education

Research Interests

Machine Learning for Big Data - Machine learning is ideal for exploiting the opportunities hidden in big data. It delivers on the promise of extracting value from big and disparate data sources with far less reliance on human direction. It is data driven and runs at machine scale. It is well suited to the complexity of dealing with disparate data sources and the huge variety of variables and amounts of data involved. Unlike traditional analysis, machine learning thrives on growing datasets: the more data fed into a machine learning system, the more it can learn and the higher the quality of the insights it produces. Freed from the limitations of human-scale thinking and analysis, machine learning is able to discover and display the patterns buried in the data.

Missing Data Imputation - Dealing with missing data is an important step in dataset pre-processing, since most statistical analysis techniques, data reduction tools, machine learning methods and recommender systems require complete datasets. Many techniques can be used to deal with missingness, but the common approach is to make the most of the available data by minimizing both the loss of statistical power and the bias inevitably introduced by inferring values for the missing data. A more interesting problem arises when the data grows to the point that state-of-the-art techniques are no longer applicable because of the computational time and memory they require. Here, new distributed and online techniques need to be studied and developed to meet this challenge.

Software Development Effort Estimation - Development effort is considered the dominant cost of software projects, so effort estimation is a critical activity for planning and monitoring software project development and for delivering the product on time and within budget. Significant over- or under-estimates can be very expensive, and the competitiveness of a software company depends heavily on the ability of its project managers to accurately predict in advance the effort required to develop software systems. Several methods have been proposed in the literature to estimate software development effort. Among them, widely employed estimation methods try to explain the effort to develop a software system in terms of some relevant factors (named cost drivers), e.g., Linear and Stepwise Regression, Regression Tree, and Case-Based Reasoning. These methods exploit data from past projects, consisting of both the factor values related to effort and the actual effort spent developing the projects, in order to estimate the effort for a new project under development. The main research topics in software development effort estimation concern the definition and empirical evaluation of search-based approaches for building novel estimation models, and the definition and empirical evaluation of functional metrics for sizing software products.

University

PhD in Computational Intelligence: Supervised Machine Learning Methods for Complex Data University of Portsmouth - 2019

Bachelor Degree in Computer Science: 1st with Honours University of Salerno - 2014

Courses and Schools attended

Improving Applicability of Nature-Inspired Optimisation Paris - Oct 18-24 2017

International School on Mathematics “Guido Stampacchia” Workshop “Graph Theory, Algorithms and Application” (3rd Edition) Erice - Sep 8-16 2014

International Summer School on Software Engineering (11th edition) University of Salerno - July 2014

Data Mining of satellite data for the study of natural hazards University of Salerno - September 2013

Work Experience

Companies

Senior Manager of Data & Machine Learning Science Expedia Group - Mar 2021 - Present

Data Science Team Lead, Machine Learning Expedia Group - Sep 2020 - Mar 2021

Senior Data Scientist, Machine Learning Expedia Group - Nov 2019 - Sep 2020

Data Scientist, Machine Learning Expedia Group - Sep 2017 - Nov 2019

Data Scientist Intern, Machine Learning Expedia Group - Jun 2017 - Sep 2017

Data Scientist Intern, Machine Learning Expedia Group - Jun 2016 - Oct 2016

Research

Honorary Senior Research Fellow University College London (UCL) - Nov 2019 - Present

Visiting Researcher at the University of Oxford (Big Data Institute) under the supervision of Dr Antoniya Georgieva. University of Oxford - Jan 2018 - Dec 2018

Erasmus+ Traineeship at the University of Salerno (DISA-MIS) under the supervision of Dr Roberto Tagliaferri. University of Salerno - Mar 2017 - Apr 2017

Erasmus+ Traineeship at the University of Portsmouth (School of Computing) under the supervision of Dr Ivan Jordanov. University of Portsmouth - Mar 2015 - Jul 2015

Research Assistant at the Penn State University (Department of GeoInformatics) under the supervision of Dr Guido Cervone. Penn State University - Mar 2014 - May 2014

Publications

International Conferences

  1. A. Petrozziello and C. Sommeregger, "Search Ranking At Hotels.com". (Under Review)

  2. F. Sarro, A. Petrozziello, D.-Q. He and S. Yoo, "A New Approach to Distribute MOEA Pareto Front Computation", in Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO’20 Companion). pdf paper

  3. A. Petrozziello and I. Jordanov, "Automated Deep Learning for Threat Detection in Luggage from X-ray Images", Special Event on Analysis of Experimental Algorithms (SEA 2019). pdf paper

  4. A. Petrozziello and I. Jordanov, "Feature Based Multivariate Data Imputation", The 4th Annual Conference on machine Learning, Optimization and Data science (LOD 2018). pdf paper

  5. A. Petrozziello, I. Jordanov, A.T. Papageorghiou, C.W.G. Redman and A. Georgieva, "Deep Learning for Continuous Electronic Fetal Monitoring in Labor", IEEE 40th International Engineering in Medicine and Biology Conference (EMBC 2018). pdf paper

  6. A. Petrozziello, C. Sommeregger and I. Jordanov, "Distributed Neural Networks for Missing Big Data Imputation", IEEE International Joint Conference on Neural Networks (IJCNN 2018). pdf paper

  7. A. Petrozziello and I. Jordanov, "Column-wise Guided Data Imputation", 17th International Conference on Computational Science (ICCS 2017). pdf paper

  8. A. Petrozziello and I. Jordanov, "Data Analytics for Online Traveling Recommendation System: A Case Study", IASTED's 36th International Conference on Modelling, Identification and Control (MIC 2017). pdf paper

  9. I. Jordanov, N. Petrov and A. Petrozziello, "Supervised Radar Signal Classification", IEEE International Joint Conference on Neural Networks (IJCNN 2016). pdf paper

  10. F. Sarro, A. Petrozziello and M. Harman, "Multi-Objective Effort Estimation", ACM 38th International Conference on Software Engineering (ICSE 2016). pdf paper Supplementary materials

International Journals

  1. A. Petrozziello, A. Serra, L. Troiano, M. La Rocca, G. Storti, I. Jordanov and R. Tagliaferri, "Deep Learning for Volatility Forecasting in Asset Management", Soft Computing (2022). pdf paper

  2. A. Petrozziello, X. Liu and C. Sommeregger, "A Scale Invariant Ranking Function for Learning-to-Rank: A Real-World Use Case", arXiv preprint (2021). pdf paper

  3. V. Tawosi, F. Sarro, A. Petrozziello and M. Harman, "Multi-Objective Software Effort Estimation: A Replication Study", IEEE Transactions on Software Engineering (2021). pdf paper

  4. F. Sarro, R. Moussa, A. Petrozziello and M. Harman, "Learning From Mistakes: Machine Learning Enhanced Human Expert Effort Estimates", IEEE Transactions on Software Engineering (2020). pdf paper

  5. A. Petrozziello, I. Jordanov, A.T. Papageorghiou, C.W.G. Redman and A. Georgieva, "Multimodal Convolutional Neural Networks to detect fetal compromise during labor and delivery", IEEE Access (2019). pdf paper

  6. A. Petrozziello and F. Sarro, "Linear Programming as a Baseline for Software Effort Estimation", ACM Transactions on Software Engineering and Methodology (2018). pdf paper

  7. I. Jordanov, N. Petrov and A. Petrozziello, "Classifiers accuracy improvement based on missing data imputation", Journal of Artificial Intelligence and Soft Computing Research (2018). pdf paper

  8. A. Petrozziello, G. Cervone, P. Franzese, S.E. Haupt and R. Cerulli, "Source Reconstruction of Atmospheric Releases with Limited Meteorological Observations Using Genetic Algorithms", Applied Artificial Intelligence Journal (2017). pdf paper

The copyright of the papers is owned by the respective publishers. Personal use of the electronic versions provided here is permitted. However, permission to reprint or republish this material for advertising or promotional purposes, to create new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the publishers.

PhD Thesis

  1. A. Petrozziello, "Supervised Machine Learning Methods for Complex Data", PhD Thesis, University of Portsmouth (UK), June 2019. pdf paper

Funding

Scholarships

  1. Erasmus+ Traineeship Scholarship (960€) University of Portsmouth - April 2017

  2. 3 years full time PhD Bursary (£42000) University of Portsmouth - September 2015

  3. Erasmus+ Traineeship Scholarship (2400€) University of Salerno - February 2015

  4. Scholarship "Messaggeri della Conoscenza" (12000€) MIUR (Italian Ministry of Education, University and Research) - October 2013

Grants

  1. The COST Action CA15140 'Improving Applicability of Nature-Inspired Optimisation' (925€) COST Action - October 2019

  2. The COST Action CA15140 'Improving Applicability of Nature-Inspired Optimisation' (1420€) COST Action - October 2017

  3. AWS Credits for Research Award ($9000) Amazon Web Services - August 2017

  4. Azure for Research Award: Data Science ($20000) Microsoft - April 2017

  5. Azure for Research Award: Machine Learning ($20000) Microsoft - May 2016

  6. Azure for Research Award: Machine Learning ($20000) Microsoft - May 2015

  7. ISSSE2014 Travel Grant University College London (UK) - June 2014

Awards

  1. Intern Prize Award - Summer Hackathon 2016 on Smart Searches Hotels.com - September 2016

  2. Bronze Medal - 13th Annual "Humies" Awards for Human-Competitive Results Produced by Genetic and Evolutionary Computation ($2000) Genetic and Evolutionary Computation Conference (GECCO16) - July 2016

  3. Accenture Talent Digital Competition - Finalist Accenture - June 2015

  4. The Young Scientist Award ($1000) The Italian Cultural Society of Washington D.C., Inc. - June 2014

Teaching

Advising

Currently (co)advising PhD Students

  1. Two PhD candidates co-supervised at University College London (UK) with Dr Federica Sarro.

Formerly (co)advised Bachelor and Master Students

  1. One Master thesis co-supervised at University College London (UK) with Dr Federica Sarro.

  2. Two Erasmus students (master thesis) co-supervised at University College London (UK), with Dr Federica Sarro (University College London, UK) and Filomena Ferrucci (University of Salerno, Italy).

Teaching Assistant

Academic Year 2017/2018

  1. First term: Artificial Neural Networks and Genetic Algorithms (NENGA), BSc year 3, Dr Ivan Jordanov, School of Computing, University of Portsmouth, Portsmouth (UK).

  2. First term: Advanced Programming Concepts (ADPROC), BSc year 2, Dr Ivan Jordanov, School of Computing, University of Portsmouth, Portsmouth (UK).

Academic Year 2016/2017

  1. Second term: Artificial Neural Networks and Genetic Algorithms (NENGA), BSc year 3, Dr Ivan Jordanov, School of Computing, University of Portsmouth, Portsmouth (UK).

  2. First term: Advanced Programming Concepts (ADPROC), BSc year 2, Dr Ivan Jordanov, School of Computing, University of Portsmouth, Portsmouth (UK).

Academic Year 2015/2016

  1. Second term: Artificial Neural Networks and Genetic Algorithms (NENGA), BSc year 3, Dr Ivan Jordanov, School of Computing, University of Portsmouth, Portsmouth (UK).

  2. First term: Advanced Programming Concepts (ADPROC), BSc year 2, Dr Ivan Jordanov, School of Computing, University of Portsmouth, Portsmouth (UK).

Professional Services

Reviewer for International Conferences

  1. The Genetic and Evolutionary Computation Conference (GECCO): 2018

  2. Symposium on Search-Based Software Engineering (SSBSE): 2017

  3. International Conference on Software Maintenance and Evolution (ICSME): 2017

Reviewer for International Journals

  1. Empirical Software Engineering: 2018, 2019

  2. IEEE Transactions on Software Engineering: 2017

  3. IET Software: 2017

  4. Information and Software Technology: 2017

  5. Journal of Hazardous Materials: 2017

Attended Conferences

  1. SPARK+AI Summit Europe, October 2019

  2. SPARK+AI Summit Europe, October 2018

  3. AI Congress and Data Science Summit, September 2018

  4. IEEE International Joint Conference on Neural Networks (IJCNN 2018), July 2018

  5. 17th International Conference on Computational Science (ICCS 2017), June 2017

Membership

Active

  1. Member of the Software Optimisation, Learning and Analytics Research (SOLAR) group University College London (UCL) - March 2020 - Present

  2. Visiting Researcher at the Big Data Institute (BDI) University of Oxford - January 2018 - December 2018

  3. Collaborator at the Dynamic Adaptive Automated Software Engineering (DAASE) - CREST Centre University College London (UCL) - October 2017 - March 2020

  4. Researcher University of Portsmouth - October 2015 - June 2019

  5. Co-Founding member of the “Geoinformatics and Earth Observation Laboratory” Penn State University - February 2014

  • "Random processes play an important role in evolution and, to some extent, in all things."

    I. Asimov