Hotel2Vec: Learning Hotel Embeddings from User Click Sessions with Side Information
Abstract: We propose a new neural network architecture for learning vector representations of items with attributes, specifically hotels. Unlike previous works, which typically only rely on modeling of user-item interactions for learning item embeddings, we propose a framework that combines several sources of data, including user clicks, hotel attributes (e.g., property type, star rating, average user rating), amenity information (e.g., if the hotel has free Wi-Fi or free breakfast), and geographic information that leverages an hexagonal geospatial system as well as spatial encoders. During model training, a joint embedding is learned from all of the above information. We show that including structured attributes about hotels enables us to make better predictions in a downstream task than when we rely exclusively on click data. We train our embedding model on more than 60 million user click sessions from a leading online travel platform, and learn embeddings for more than one million hotels. Our final learned embeddings integrate distinct sub-embeddings for user clicks, hotel attributes, and geographic information, providing a representation that can be used flexibly depending on the application.
An important advantage of the proposed neural model is that it addresses the cold-start problem for hotels with insufficient historical click information by incorporating additional hotel attributes, which are available for all hotels.
We show through the results of an online A/B test that our model generates high-quality representations that boost the performance of a hotel recommendation system on a large online travel platform.
Authors: Ioannis Partalas, Anne Morvan, Ali Sadeghian, Shervin Minaee, Xinxin Li, Brooke Cowan, Daisy Zhe Wang. Proceedings of the Workshop on (Recommenders in Tourism) co-located with the 15th ACM Conference on Recommender Systems (RecSys 2021), 2974, pp.69-84, Online, 26 Sep.
[PDF]
Structured adaptive and random spinners for fast machine learning computations
Abstract: We consider an efficient computational framework for speeding up several machine learning algorithms with almost no loss of accuracy. The proposed framework relies on projections via structured matrices that we call Structured Spinners, which are formed as products of three structured matrix-blocks that incorporate rotations. The approach is highly generic, i.e. i) structured matrices under consideration can either be fully-randomized or learned, ii) our structured family contains as special cases all previously considered structured schemes, iii) the setting extends to the non-linear case where the projections are followed by non-linear functions, and iv) the method finds numerous applications including kernel approximations via random feature maps, dimensionality reduction algorithms,new fast cross-polytope LSH techniques, deep learning, convex optimization algorithms via Newton sketches, quantization with random projection trees, and more. The proposed framework comes with theoretical guarantees characterizing the capacity of the structured model in reference to its unstructured counterpart and is based on a general theoretical principle that we describe in the paper. As a consequence of our theoretical analysis, we provide the first theoretical guarantees for one of the most efficient existing LSH algorithms based on the $HD_3 HD_2 HD_1$ structured matrix [Andoni et al., 2015]. The exhaustive experimental evaluation confirms the accuracy and efficiency of structured spinners for a variety of different applications.
Authors: Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski, Francois Fagan, Cédric Gouy-Pailler, Anne Morvan, Nouri Sakr, Tamas Sarlos, Jamal Atif. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS'17), 54, pp.1020-1029, Fort Lauderdale, FL, USA, 20-22 Apr.
I gave an oral presentation on preliminary work of this paper at Google Research NY seminar on July, 14th 2016. I also presented the final work at Expedia Geneva on April, 6th 2018.
[PDF]
[Code]
[Google seminar slides]
[AISTATS 2017 Poster]
[Expedia seminar slides]
Streaming Binary Sketching based on Subspace Tracking and Diagonal Uniformization
Abstract: In this paper, we address the problem of learning compact similarity-preserving embeddings for massive high-dimensional streams of data in order to perform efficient similarity search. We present a new online method for computing binary compressed representations -sketches- of high-dimensional real feature vectors. Given an expected code length $c$ and high-dimensional input data points, our algorithm provides a $c$-bits binary code for preserving the distance between the points from the original high-dimensional space. Our algorithm does not require neither the storage of the whole dataset nor a chunk, thus it is fully adaptable to the streaming setting. It also provides low time complexity and convergence guarantees.
We demonstrate the quality of our binary sketches through experiments on real data for the nearest neighbors search task in the online setting.
Authors: Anne Morvan, Antoine Souloumiac, Cédric Gouy-Pailler, Jamal Atif. Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018), Calgary, Alberta, Canada, 15-20 Apr.
I presented this work at ICASSP 2018 (Calgary, Canada) during poster session on Tuesday, April 17 2018.
[PDF]
[Code]
[ICASSP 2018 Poster]
On the Needs for Rotations in Hypercubic Quantization Hashing
Abstract: The aim of this paper is to endow the well-known family of hypercubic quantization hashing methods with theoretical guarantees.
In hypercubic quantization, applying a suitable (random or learned) rotation after dimensionality reduction has been experimentally shown to improve the results accuracy in the nearest neighbors search problem.
We prove in this paper that the use of these rotations is optimal under some mild assumptions: getting optimal binary sketches is equivalent to applying a rotation uniformizing the diagonal of the covariance matrix between data points. Moreover, for two closed points, the probability to have dissimilar binary sketches is upper bounded by a factor of the initial distance between the data points. Relaxing these assumptions, we obtain a general concentration result for random matrices.
We also provide some experiments illustrating these theoretical points and compare a set of algorithms in both the batch and online settings.
Authors: Anne Morvan, Antoine Souloumiac, Krzysztof Choromanski, Cédric Gouy-Pailler, Jamal Atif. 2018
[PDF]
[Code]
Graph sketching-based Space-efficient Data Clustering
Abstract: In this paper, we address the problem of recovering arbitrary-shaped data clusters from datasets while facing high space constraints, as this is for instance the case in many real-world applications when analysis algorithms are directly deployed on resources-limited mobile devices collecting the data.
We present DBMSTClu a new space-efficient density-based non-parametric method working on a Minimum Spanning Tree (MST) recovered from a limited number of linear measurements
i.e. a sketched version of the dissimilarity graph $\mathcal{G}$ between the $N$ objects to cluster. Unlike $k$-means, $k$-medians or $k$-medoids algorithms, it does not fail at distinguishing clusters with particular forms thanks to the property of the MST for expressing the underlying structure of a graph. No input parameter is needed contrarily to DBSCAN or the Spectral Clustering method. An approximate MST is retrieved by following the dynamic semi-streaming model in handling the dissimilarity graph $\mathcal{G}$ as a stream of edge weight updates which is sketched in one pass over the data into a compact structure requiring $O(N \operatorname{polylog}(N))$ space, far better than the theoretical memory cost $O(N^2)$ of $\mathcal{G}$. The recovered approximate MST $\mathcal{T}$ as input, DBMSTClu then successfully detects the right number of nonconvex clusters by performing relevant cuts on $\mathcal{T}$ in a time linear in $N$. We provide theoretical guarantees on the quality of the clustering partition and also demonstrate its advantage over the existing state-of-the-art on several datasets.
Authors: Anne Morvan, Krzysztof Choromanski, Cédric Gouy-Pailler, Jamal Atif. Proceedings of the 2018 SIAM International Conference on Data Mining (SDM'18), pp.10-18, San Diego, CA, USA, 3-4 May.
I presented preliminary work of this paper during the poster session at the DS3 Data Science Summer School of Ecole Polytechnique on August, 30th 2017.
This paper I presented this work at SDM'18 (San Diego, USA) during oral and poster sessions on Thursday, May 3 2018.
[PDF]
[Code]
[DS3 Poster]
[SDM'18 Poster]
[SDM'18 Slides]
Graph-based Clustering under Differential Privacy
Abstract: In this paper, we present the first differentially private clustering method for arbitrary-shaped node clusters in a graph. This algorithm takes as input only an approximate Minimum Spanning Tree (MST) $\mathcal{T}$ released under weight differential privacy constraints from the graph. Then, the underlying nonconvex clustering partition is successfully recovered from cutting optimal cuts on $\mathcal{T}$. As opposed to existing methods, our algorithm is theoretically well-motivated. Experiments support our theoretical findings.
Authors: Rafael Pinot, Anne Morvan, Florian Yger, Cédric Gouy-Pailler, Jamal Atif. Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI 2018), Monterey, CA, USA, 6-10 Aug.
[PDF]
[Code DBMSTClu]
[Code PAMST]