The following is a non-exhaustive list of research projects I am actively involved in.
Machine Learning and Data Mining
Matrix and Tensor Factorisation:
Many types of data can be represented as a matrix or a tensor (which can be considered a multi-dimensional matrix). Factorisation involves finding a low-dimensional representation that captures the intrinsic patterns and structure of the original data. In this research, we have formulated graph clustering problems as constrained factorisations, and have applied factorisation to the contrast mining of tensors.
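As an illustration of what a low-dimensional factorisation looks like, the following is a minimal sketch of non-negative matrix factorisation using the standard multiplicative update rules. It is a generic textbook technique chosen for illustration, not the constrained factorisations developed in this research; the function names (`nmf`, `frob_error`) and the toy matrix are assumptions made for the example.

```python
import random

def matmul(A, B):
    # Plain-Python matrix product of two lists of lists.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(X, rank, iters=200, seed=0, eps=1e-9):
    """Approximate a non-negative matrix X (n x m) by W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    # Strictly positive random initialisation (illustrative choice).
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return W, H

def frob_error(X, W, H):
    # Frobenius norm of the reconstruction error X - W H.
    R = matmul(W, H)
    return sum((x - r) ** 2 for xr, rr in zip(X, R) for x, r in zip(xr, rr)) ** 0.5

# Toy data: rows 0-1 and rows 2-3 follow two different patterns.
X = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]
W, H = nmf(X, rank=2)
print(frob_error(X, W, H))
```

Each row of `W` says how strongly a data row expresses each of the `rank` latent patterns, and each row of `H` describes one pattern over the original columns. Adding constraints to `W` or `H` is one way such a factorisation can be turned into a clustering.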
Multi-Clustering and Cluster Validation:
In a dataset, there is often more than one possible clustering. In this research, we have designed a new approach that uses ideas of filtering and ranking to improve the quality and diversity of the final discovered clustering views.
Feature selection focuses on selecting an informative subset of features, but there has been much less work on unsupervised feature selection in data streams. In this research, we investigate new efficient algorithms for selecting features in data streams.
Graphs and Network Analysis
Graph clustering involves grouping vertices together to find common communities or the underlying structure in graphs. These groupings can be used for marketing, for abstraction to better understand large complex systems and organisations, for protein functionality discovery, and so on. A well-known clustering approach is community detection, where the groups have many connections among themselves and few between groups. These groups can represent friendship groups or protein groups. However, community structure is not the only interesting structure in graphs. An example is the core-periphery structure, where a central but small core of vertices is well connected to all other vertices, while the other groups have fewer connections among themselves and to the other groups. This structure exists in most real graphs.
Finding this type of structure requires a more general definition of a graph cluster. Blockmodelling is the technique used to find these more general structures. In this research, we have devised a number of approaches to find more interpretable and overlapping-membership blockmodels.
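To make the difference between community and core-periphery structure concrete, the sketch below computes the block density matrix (sometimes called the image matrix) of a graph for a given assignment of vertices to groups. It only evaluates a known grouping; it does not search for one, and it is not one of the blockmodelling approaches developed in this research. The helper names and the two toy graphs are assumptions made for the example.

```python
def adjacency(n, edge_list):
    # Build a symmetric 0/1 adjacency matrix for an undirected graph.
    adj = [[0] * n for _ in range(n)]
    for u, v in edge_list:
        adj[u][v] = adj[v][u] = 1
    return adj

def block_densities(adj, groups, k):
    """Return the k x k matrix of edge densities between groups.

    groups[v] is the group index of vertex v; self-loops are ignored.
    """
    edges = [[0] * k for _ in range(k)]
    pairs = [[0] * k for _ in range(k)]
    n = len(adj)
    for u in range(n):
        for v in range(n):
            if u == v:
                continue
            edges[groups[u]][groups[v]] += adj[u][v]
            pairs[groups[u]][groups[v]] += 1
    return [[edges[a][b] / pairs[a][b] if pairs[a][b] else 0.0
             for b in range(k)] for a in range(k)]

# Community structure: two triangles joined by a single bridge edge.
community = adjacency(6, [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)])
community_blocks = block_densities(community, [0, 0, 0, 1, 1, 1], 2)

# Core-periphery: core {0, 1} linked to everything, periphery {2, 3, 4} unlinked.
core_periphery = adjacency(5, [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4)])
cp_blocks = block_densities(core_periphery, [0, 0, 1, 1, 1], 2)

print(community_blocks)  # dense diagonal, sparse off-diagonal
print(cp_blocks)         # dense core row and column, empty periphery block
```

In the community graph the density is high only on the diagonal of the block matrix, whereas in the core-periphery graph the core block and the core-to-periphery blocks are dense and the periphery block is empty. A blockmodel allows any such pattern of dense and sparse blocks, which is why it is more general than community detection.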
Social Computing (Analysing social media and networks)
Planning a travel itinerary can be time-consuming, frustrating and difficult to do well. One has to find and choose interesting places to visit and appropriate accommodation, plan and schedule transportation between locations, and figure out where to eat, all within time and monetary budgets. In this project, we propose new recommendation-based approaches that advance the ultimate aim of recommending personalised travel itineraries.
Users in online forums and other social media can be considered to take different roles. Using forums as an example, users in a computer overclocking forum can play the expert, enthusiast or newbie role (or a mixture of them). In this research, we proposed a number of forum-based features to group users into common roles. In addition, blockmodelling (see above) can be used to find common user groups and roles in networks.
Cloud computing is one of the hot topics in academia and industry, due to its promise of ubiquitous computing. However, one of the major concerns about its usage is privacy and security: how do you ensure your competitor doesn’t steal your company information if you share the same cloud computing host and hardware? One security concern is the co-residency attack. In this type of attack, an attacker builds side channels into the target victim’s virtual machine and compromises the logical partitioning that is one of the main premises on which cloud computing guarantees privacy and security. Prior research tries to stop this type of attack by preventing side channels, but this requires fundamental changes to the cloud computing system, which are difficult to make and can be costly to a cloud provider in terms of efficiency and service guarantees. Hence, in this research, we proposed a number of innovations:
- We propose a novel approach to make it hard to co-locate virtual machines, hence making it difficult for an attacker to launch co-residency attacks.
- We propose a framework that trades off power usage efficiency, job efficiency and a new factor, security. Traditionally, operators have concentrated on the first two factors.
- We analyse and model virtual machine allocation traces in order to understand normal allocation demands. This model is then used in our new allocation approaches to prevent co-location.