Analysis of dynamic network data
Most works in network analysis concentrate on static graphs and find patterns such as the most influential nodes in the network. Very few existing methods are able to deal with repeated interactions between nodes in a network. The main goal of the research in this topic is hence to fill this gap by developing methods to identify patterns in interactions between network nodes. We studied so-called information channels, which indicate information flows.
Process Mining
In process mining the objects of study are logs generated by business processes. Consider for instance a log generated by a leave request system, recording activities such as users logging in, opening a new request, managers approving requests, emails being sent by the system, etc. In process mining such logs are analyzed to better understand, monitor, and improve the business processes. One task in this context is detecting complex events. Complex events can be used to find pre-defined security problems or abnormalities. Often, however, anomalies occur that are not foreseen in the system. In order to handle such cases, anomaly detection techniques are necessary. With the following work on model-based anomaly detection using dynamic Bayesian networks, we won the Business Process Intelligence challenge at the BPM 2018 conference: S. Pauwels and T. Calders. Detecting and Explaining Drifts in Yearly Grant Applications. In BPM Workshop Business Process Intelligence (BPI), 2018.
Fairness-Aware Machine Learning
In contemporary society we are continuously being profiled: banks profile people according to credit risk, insurance companies profile clients for accident risk, telephone companies profile users on their calling behavior, and web corporations profile users according to their interests and preferences based on web activity and visitation patterns.
These profiles are increasingly built automatically by machine learning methods trained on historical data. Within society there are growing concerns that these machine learning methods do not have ethical or moral restrictions. Recent studies indeed show that in circumstances where historical data is biased, or when there is omitted-variable bias, automatically learned models may take decisions that could be considered discriminatory. Apart from ethical considerations, there are also legal restrictions on the use of profiling methods that blindly optimize accuracy without taking unwanted discriminatory effects into account. The recent General Data Protection Regulation (GDPR; Regulation (EU) 2016/679) explicitly mentions profiling (Art. 22 GDPR, Automated individual decision-making, including profiling) as an activity in which decisions should not be based on personal data and suitable measures should be in place to safeguard the data subject's rights and freedoms and legitimate interests. Most profiling techniques, however, do not consider anti-discrimination legislation and may unintentionally produce models that are unfair and hence do not safeguard the data subject's freedoms. A further complication is that detecting whether a model is unfair is often highly non-trivial.
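One of the simplest fairness diagnostics is demographic parity: comparing the rate of positive decisions across groups. The sketch below is a minimal illustration of that idea, not any specific method from this research; the function name and the toy decision log are ours.

```python
from collections import defaultdict

def demographic_parity_gap(records):
    """Positive-decision rate per group, and the gap between the
    best- and worst-treated group.

    `records` is a list of (group, decision) pairs, where decision is 1
    for a positive outcome (e.g. loan approved) and 0 otherwise.
    """
    pos = defaultdict(int)   # positive decisions per group
    tot = defaultdict(int)   # total decisions per group
    for group, decision in records:
        tot[group] += 1
        pos[group] += decision
    rates = {g: pos[g] / tot[g] for g in tot}
    return rates, max(rates.values()) - min(rates.values())

# Toy log: 70% positive rate for group "a", 40% for group "b".
log = [("a", 1)] * 7 + [("a", 0)] * 3 + [("b", 1)] * 4 + [("b", 0)] * 6
rates, gap = demographic_parity_gap(log)
```

A large gap flags a possible disparate impact, but by itself it proves nothing about its cause: a nonzero gap can be legitimate or discriminatory depending on context, which is exactly why detecting unfairness is non-trivial.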
Abstract: Digital transformation has caused changes in all aspects of human life. In the DigiTax project, we look at the tax implications of this process from two perspectives. First, we examine the challenges that digitalisation brings to the tax area. For example, in the digital economy multinationals have more opportunities to shift profits to low-tax countries. Where should these profits be taxed? Also, robots are increasingly entering the labor market, from self-driving cars to chatbots. Should they be considered a separate taxable entity, and if so, how do these robots need to be taxed? More generally, we will look at: (a) which tax regimes come under pressure; (b) whether there is a need to change the traditional tax concepts and, if so, which new tax concepts can be developed to contribute to fairer taxation; (c) who is legitimately authorized to make these changes; and (d) how to implement the change. Second, and vice versa, we study the opportunities that digitalisation creates for the fairness of taxation and the efficiency and effectiveness of the tax authorities. For example, how can improved data mining algorithms or the inclusion of novel data sources help to develop more accurate, understandable and discrimination-free fraud detection systems that minimize tax non-compliance or tax evasion? Or how can blockchain technology improve transparency, tax compliance and trust between authorities and taxpayers? We will specifically look at the opportunities that data mining, the internet of things (IoT) and blockchain technology bring to the tax domain. This project explicitly calls for a multidisciplinary approach, studying the technological, legal, economic and societal implications of digitalisation and tax.
- Promotor: Peeters Bruno
- Co-promotor: Calders Toon
- Co-promotor: Jorissen Ann
- Co-promotor: Martens David
- Co-promotor: Van de Vijver Anne
Abstract: Most works in network analysis concentrate on static graphs and find patterns such as which nodes are the most influential in the network. Very few existing methods are able to deal with repeated interactions between nodes in a network. The main goal of the project is hence to fill this gap by developing methods to identify patterns in interactions between network nodes. These interaction patterns could characterize information propagation in social networks, or money streams in financial transaction networks. We consider three orthogonal dimensions. The first one is the pattern type. We consider, among others, temporal paths, information cascade trees and cycles. To guide our choice of which patterns to study, we draw inspiration from three real-world cases: two interaction networks with payment data, one with a marketing-related task and one for default prediction, and one social network with an application in microfinance. The second dimension is how to use the query pattern: exhaustively find all occurrences of the pattern, or as a participation query that finds nodes that participate most often in a pattern of interest. Finally, the third dimension concerns the computational model: offline, one-pass, or streaming. It is important to scale up to large interaction networks. In summary, the novelty of our proposal lies in the combination of streaming techniques, pattern mining, and social network analysis, validated on three real-world cases.
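To make the first pattern type concrete: a temporal path follows edges in strictly increasing timestamp order, which is what distinguishes an actual flow through the network from a path that merely exists in the static graph. The following sketch enumerates such paths from a source node; it is a simplified offline illustration (names and the hop bound are ours), not the project's streaming algorithms.

```python
from collections import defaultdict

def temporal_paths(interactions, source, max_hops=3):
    """Enumerate temporal paths starting at `source`.

    `interactions` is a list of (u, v, t) triples. A temporal path
    follows edges with strictly increasing timestamps, so it can model
    how information (or money) actually propagates over time.
    """
    out = defaultdict(list)  # adjacency list with timestamps
    for u, v, t in interactions:
        out[u].append((v, t))

    paths = []

    def extend(path, last_t):
        if len(path) > max_hops:  # path already has max_hops edges
            return
        for v, t in out[path[-1]]:
            if t > last_t:        # timestamps must strictly increase
                paths.append(path + [v])
                extend(path + [v], t)

    extend([source], float("-inf"))
    return paths

edges = [("a", "b", 1), ("b", "c", 2), ("b", "c", 0), ("c", "d", 5)]
found = temporal_paths(edges, "a")
```

Note how the edge (b, c, 0) is never used: it happened before the interaction that reached b, so no information from a could have traveled over it. This temporal constraint is precisely what static path queries miss.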
Abstract: The Flemish AI research program aims to stimulate strategic basic research focusing on AI at the different Flemish universities and knowledge institutes. This research must be applicable and relevant for the Flemish industry. Concretely, the program addresses 4 grand challenges:
1. Help to make complex decisions: focuses on complex decision-making despite the potential presence of wrong or missing information in the datasets.
2. Extract and process information at the edge: focuses on the use of AI systems at the edge instead of in the cloud, through the integration of software and hardware and the development of algorithms that require less power and fewer other resources.
3. Interact autonomously with other decision-making entities: focuses on the collaboration between different autonomous AI systems.
4. Communicate and collaborate seamlessly with humans: focuses on the natural interaction between humans and AI systems and the development of AI systems that can understand complex environments and apply human-like reasoning.
- Promotor: Latré Steven
- Co-promotor: Calders Toon
- Co-promotor: Daelemans Walter
- Co-promotor: Goethals Bart
- Co-promotor: Hellinckx Peter
- Co-promotor: Laukens Kris
- Co-promotor: Martens David
- Co-promotor: Sijbers Jan
- Co-promotor: Steckel Jan
Abstract: In this project, we study the realization of an inductive database model. The most important steps in the realization of such a model are: a) a uniform representation of patterns and data; b) a query language for querying both the data and the patterns; c) the integration of existing optimization techniques into the physical layer.
Abstract: The aim of data mining is to find useful information, such as trends and patterns, in large databases. These databases often contain confidential or personal information. Therefore, it is important to assess to what degree the application of data mining techniques can harm the privacy of individuals. In this project, we want to develop methods that assess the degree of disclosure of private information caused by a data mining operation. Since complete methods likely have too high a complexity, we will also pay attention to incomplete, heuristic methods.
- Promotor: Calders Toon
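A standard building block for reasoning about disclosure is k-anonymity: a released table is k-anonymous if every combination of quasi-identifier values (attributes that could link a row back to a person) is shared by at least k rows. The sketch below is a generic illustration of that measure, not a method from this project; the column names are invented.

```python
from collections import Counter

def k_anonymity(table, quasi_identifiers):
    """Smallest equivalence-class size over the quasi-identifier columns.

    `table` is a list of dict rows; the returned value k means every
    quasi-identifier combination occurs in at least k rows.
    """
    classes = Counter(
        tuple(row[c] for c in quasi_identifiers) for row in table
    )
    return min(classes.values())

rows = [
    {"age": "20-30", "zip": "2000", "disease": "flu"},
    {"age": "20-30", "zip": "2000", "disease": "cold"},
    {"age": "30-40", "zip": "2600", "disease": "flu"},
]
k = k_anonymity(rows, ["age", "zip"])  # the third row is unique, so k = 1
```

A value of k = 1 means some individual is uniquely re-identifiable from the quasi-identifiers alone; assessing how much a mining operation's *output* (rather than the raw table) discloses is the harder question the project targets.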