CIKM 2022

Workshops and Analyticup

The CIKM 2022 workshop program will host 11 compelling workshops and 2 Analyticups that highlight the breadth of interesting problems being explored in the fields of Information and Knowledge Management.

#
Title
Contact
Website
1
Applied Machine Learning Methods for Time Series Forecasting (AMLTS)
Linsey Pang
Email
Full Day
Organizers
Linsey Pang (Salesforce), Wei Liu (University of Technology Sydney), Lingfei Wu (JD.COM), Kexin Xie (Salesforce), Stephen Guo (Walmart), Raghav Chalapathy (Walmart), Musen Wen (Walmart)
Abstract
Time series data is ubiquitous, and accurate time series forecasting is vital for many real-world application domains, including retail, healthcare, supply chain, climate science, e-commerce and economics. The choice of machine learning methods, both conventional and deep learning-based models, primarily depends on the nature of the input data. In addition, several models have been adopted in industry with great success. The goals of our workshop are to: highlight the real-world scalability challenges of learning from the vast amount of time-series data in industrial-scale applications (e.g., sampling methodology, causal inference, uncertainty quantification, anomaly detection); discuss recent developments in applied algorithms for tackling scalability problems; and explore newly applied methods in time series analysis and their connections with emerging fields such as causal discovery and machine learning for science. In light of the recent pandemic and geopolitical events, we also plan to emphasize anomaly detection methods and their influence on time series forecasting accuracy at our workshop. Time series modeling has a long tradition of innovative approaches from many disciplines, including statistics and data mining. Forecasting, in general, has led to broad impact and a diverse range of applications, making it an ideal topic for the timely dissemination of new ideas at CIKM. We hope that the diversity and expertise of our speakers and attendees will help uncover new approaches and break new ground in these challenging and vital settings. We envision that our workshop will continue to appeal to the CIKM audience and stimulate many interdisciplinary discussions.
2
The 1st International Workshop on Federated Learning with Graph Data (FedGraph)
Carl Yang
Email
Full Day
Organizers
Carl Yang (Emory University), Xiaoxiao Li (University of British Columbia), Nathalie Baracaldo (IBM Research), Neil Shah (Snap Research), Chaoyang He (FedML), Lingjuan Lyu (Sony AI), Lichao Sun (Lehigh University) and Salman Avestimehr (University of Southern California)
Abstract
The field of graph data mining, one of the most important AI research areas, has been revolutionized by graph neural networks (GNNs), which benefit from training on real-world graph data with millions to billions of nodes and links. Unfortunately, handling the training data and training process of GNNs on graphs beyond millions of nodes is extremely costly, if not impossible, on a centralized server. Moreover, due to increasing concerns about data privacy, emerging data from realistic applications are naturally fragmented, forming distributed private graphs across multiple “data silos”, among which direct transfer of data is forbidden. The nascent field of federated learning (FL), which aims to enable individual clients to jointly train their models while keeping their local data decentralized and completely private, is a promising paradigm for large-scale distributed and private training of GNNs. The FedGraph workshop aims to bring together researchers from different backgrounds with a common interest in how to extend current FL algorithms to operate with graph data models such as GNNs. FL is an extremely hot topic of great commercial interest and has been intensively explored for machine learning with visual and textual data. Exploration by graph mining researchers and industrial practitioners has only recently begun to catch up. There are many unexplored challenges and opportunities, which call for the establishment of an organized and open community to collaboratively advance the science behind it. The prospective participants of this workshop will include researchers and practitioners from both the graph mining and federated learning communities, whose interests include, but are not limited to: graph analysis and mining, heterogeneous network modeling, complex data mining, large-scale machine learning, distributed systems, optimization, meta-learning, reinforcement learning, privacy, robustness, explainability, fairness, ethics, and trustworthiness.
3
AIMLAI: Advances in Interpretable Machine Learning and Artificial Intelligence
Luis Galárraga
Email
Full Day
Organizers
Adrien Bibal (Université catholique de Louvain), Tassadit Bouadi (University of Rennes I), Benoît Frénay (Université de Namur), Luis Galárraga (Inria) and José Oramas (University of Antwerp)
Abstract
Recent technological advances rely on accurate decision support systems that can be perceived as black boxes due to their overwhelming complexity. This lack of transparency can lead to technical, ethical, legal, and trust issues. For example, if the control module of a self-driving car fails to detect a pedestrian, it becomes crucial to know why the system erred. In other cases, the decision system may reflect unacceptable biases that can generate distrust. The General Data Protection Regulation (GDPR), approved by the European Parliament in 2016 and applicable since 2018, suggests that individuals should be able to obtain explanations of the decisions made from their data by automated processing, and to challenge those decisions. All these reasons have propelled research in interpretable and explainable AI/ML. AIMLAI aims to gather researchers, experts and professionals, from inside and outside the domain of AI, who are interested in the topic of interpretable or explainable AI/ML. The workshop encourages interdisciplinary collaborations, with particular emphasis on knowledge management, infovis, human-computer interaction, and psychology. It also welcomes applied research for use cases where understanding the model matters.
4
TrustLOG: The First Workshop on Trustworthy Learning on Graphs
Dawei Zhou
Email
Full Day
Organizers
Dawei Zhou (Virginia Tech), Jingrui He (University of Illinois at Urbana-Champaign), Jian Kang (University of Illinois at Urbana-Champaign), Bo Li (University of Illinois at Urbana-Champaign), Jian Pei (Simon Fraser University) and Shuaicheng Zhang (Virginia Tech)
Abstract
Learning on graphs is at the core of many domains, ranging from information retrieval and social network analysis to transportation and computational chemistry. Years of research in this area have developed a wealth of theories, algorithms, and open-source systems for a variety of learning tasks. State-of-the-art graph learning models have been widely deployed in various real-world applications, often delivering superior empirical performance in answering what/who questions. For example, what are the most relevant web pages with respect to a user query? Who can be grouped into the same community? What items should we recommend to best fit user preferences? Despite the prosperous development of high-utility graph learning models, recent studies reveal that learning on graphs is not trustworthy in many aspects. For example, existing methods make decisions in a black-box manner, which prevents end-users from understanding and trusting model decisions. Many commonly applied approaches are also found to be vulnerable to malicious attacks, biased against individuals from certain demographic groups, or susceptible to information leakage. As such, a fundamental question remains largely open: how can we make learning algorithms on graphs trustworthy? To answer this question, it is crucial to propose a paradigm shift, from answering what/who to understanding how/why, e.g., how the ranking of webpages can be manipulated by malicious link farms; why two seemingly different users are grouped into the same online community; how sensitive recommendation results are to random noise or fake ratings.
5
The Third Workshop on Data-driven Intelligent Transportation
Hua Wei
Email
Half Day AM
Organizers
Hua Wei (New Jersey Institute of Technology), Guni Sharon (Texas A&M University), Cathy Wu (MIT), Sanjay Chawla (Qatar Computing Research Institute) and Zhenhui Li (Yunqi Academy of Engineering)
Abstract
Traffic is the pulse of the city. Transportation systems can involve humans, vehicles, shipments, information technology, and the physical infrastructure, all interacting in complex ways. Intelligent transportation enables the city to function in a more efficient and effective way. A wide range of city data is becoming increasingly available, such as taxi trips, surveillance camera data, human mobility data from mobile phones or location-based services, events from social media, car accident reports, bike-sharing information, points of interest, traffic sensors, public transportation data, and many more. This abundance of data poses a grand challenge to the CIKM research community: how can such data be utilized toward city intelligence, across various transportation tasks? The 3rd workshop on "Data-driven Intelligent Transportation" welcomes articles and presentations in the areas of transportation systems, data mining, and artificial intelligence, conveying new advances and developments in theory, modeling, simulation, testing, and case studies, as well as large-scale deployment.
6
Workshop on Human-in-the-loop Data Curation
Gianluca Demartini
Email
Full Day
Organizers
Gianluca Demartini (The University of Queensland), Shazia Sadiq (The University of Queensland) and Jie Yang (Delft University of Technology)
Abstract
Although data quality is a long-standing and enduring problem, it has recently received renewed attention due to the fast proliferation of data analytics, machine learning, and decision-support applications built upon the wide-scale availability and accessibility of (big) data. The success of such applications relies heavily on not only the quantity, but also the quality of data. Data curation, which may include ingestion, annotation, cleaning, integration, etc., is a critical step in providing adequate assurances on the quality of analytics and machine learning results. Such data preparation activities are recognised as time and resource intensive for data scientists, as data often comes with a number of challenges that need to be tackled before it can be used in practice. Data re-purposing, and the resulting distance between the design and use intentions of the data, is a fundamental issue behind many of these challenges. These challenges include a variety of data issues such as noise and outliers, incompleteness, representativeness or biases, heterogeneity of format or semantics, etc. Mishandling these challenges can lead to negative and sometimes damaging effects, especially in critical domains like healthcare, transport, and finance. A distinctive feature of data quality in these contexts is the increasingly important role played by humans, who are often both the source of data generation and active players in data curation. This workshop will provide an opportunity to explore the interdisciplinary overlap between manual, automated, and hybrid human-machine methods of data curation.
7
The 1st International Workshop on Privacy Algorithms in Systems (PAS)
Olivera Kotevska
Email
Half Day PM
Organizers
Philip Yu (University of Illinois Chicago), Olivera Kotevska (Oak Ridge National Laboratory) and Tyler Derr (Vanderbilt University)
Abstract
Today we face an explosion of data generation, ranging from health monitoring to national security infrastructure systems. More and more systems that collect data at regular time intervals are connected to the Internet. These systems share data and use machine learning methods for intelligent decisions, which has benefited numerous real-world applications (e.g., autonomous vehicles, recommendation systems, and heart-rate monitoring). However, these approaches are prone to identity theft and other privacy-related cyber-security attacks. So, how can data privacy be protected efficiently in these scenarios? More dedicated efforts are needed to integrate privacy techniques into existing systems and to develop more advanced privacy techniques that address the complex challenges of multi-system connectivity and data fusion. Therefore, we propose the PAS workshop at CIKM’22, which provides a venue for academic researchers and industry researchers/practitioners to present their research in an effort to advance the frontier of this critical direction of privacy algorithms in systems. We will host our proposed work at the following URL: hxxps://pasworkshop.github.io/.
8
Deep Learning for Search and Recommendation
Wei Liu
Email
Full Day
Organizers
Wei Liu (University of Technology Sydney), Kexin Xie (Salesforce), Linsey Pang (Salesforce), Yuxi Zhang (Salesforce), James Bailey (The University of Melbourne) and Longbing Cao (University of Technology Sydney)
Abstract
In the current digital world, web search engines and recommendation systems are continuously evolving, opening up new challenges every day that require more sophisticated and efficient data mining and machine learning solutions to satisfy the needs of sellers and consumers as well as marketers. The quality of search and recommendation systems impacts customer retention, time on site, and sales volume. For instance, with often sparse conversion rates, highly personalized content, and heterogeneous digital sources, research engineers and data scientists need to develop more rigorous and effective models. At the same time, deep learning has started to show great impact in many industrial applications that process complicated, large-scale, and real-time data. Deep learning not only provides more opportunities to increase conversion rates and improve revenue through a positive customer experience, but also provides customers with personalized content along their personal shopping journey. Due to this rapid growth of the digital world, there is a need to bring professionals together from both academic research and industry to solve real-world problems. This is exactly what this workshop aims to achieve. Topics of this workshop include deep learning based query understanding, personalization, representation learning, product retrieval, recommendation algorithms, ranking algorithms, etc.
9
THECOG - Transforms in behavioral and affective computing
Georgios Drakopoulos
Email
Half Day PM
Organizers
Georgios Drakopoulos (Ionio University) and Eleanna Kafeza (Zayed University)
Abstract
Human decision making is central to many functions across a broad spectrum of fields including marketing, investments and smart contracts, digital health, political campaigns, logistics, and strategic management, to name only a few. Computational behavioral science, the focus of the proposed workshop, not only studies the various psychological, cultural, and social factors that contribute to decision making besides reasoning, but also seeks to construct robust, scalable, and efficient computational models that imitate or extend this way of decision making. It should be highlighted here that computational behavioral science does not negate the rationality axioms of classical economic theory but rather extends them. Computational behavioral science can evaluate individual or collective decision-making processes in massive populations with signal estimation or deep learning techniques based on a wide array of attributes, ranging from social media posts and multimedia to physiological signs and neuroimaging results. Additionally, time-dependent decision-making processes can be understood and processed in a signal processing context and even tracked with input-output or state space models. As disposition towards alternative decisions may well change over time, this is a major advantage compared to traditional decision-making analysis. So far the primary findings in the field are concepts like bounded rationality and perceived risk, while results include optimal strategies for various levels of information awareness and action strategies based on perceived loss aversion principles, which have been successfully applied to many situations. For a second consecutive year, THECOG will be a central meeting point for researchers to generate new interdisciplinary and groundbreaking results.
10
The 2nd Workshop on Mixed-Initiative ConveRsatiOnal Systems (MICROS)
Ida Mele
Email
Half Day AM
Organizers
Ida Mele (IASI-CNR), Cristina Ioana Muntean (ISTI-CNR), Mohammad Aliannejadi (University of Amsterdam) and Nikos Voskarides (University of Amsterdam)
Abstract
The Mixed-Initiative ConveRsatiOnal Systems workshop (MICROS) aims to bring together novel ideas and investigate new solutions for conversational assistant systems. The increasing popularity of personal assistant systems, as well as smartphones, has changed the way users access online information, posing new challenges for information seeking and filtering. MICROS will have a particular focus on mixed-initiative conversational systems, namely, systems that can provide answers in a proactive way (e.g., asking for clarification or proposing possible interpretations for ambiguous and vague requests). We invite people working on conversational systems or interested in the workshop topics to send us their position and research manuscripts.
11
Workshop on Proactive and Agent-Supported Information Retrieval
Procheta Sen
Email
Half Day PM
Organizers
Gareth Jones (Dublin City University), Procheta Sen (University College London), Debasis Ganguly (University of Glasgow) and Emine Yilmaz (University College London)
Abstract
Established information retrieval (IR) systems are generally reactive in that they respond to active entry of a search query by a user. Information is thus only provided to the user when they identify a need for information and invest the effort to address this need using a search engine. As such, there may be considerable cost to users in satisfying their information need, and they may fail to access available useful information if they do not actively look for it. In contrast to reactive systems, Proactive Information Retrieval (PIR) systems seek to retrieve relevant content without the user explicitly entering a query. A PIR system seeks to do this by using a combination of observed user activities, their context and potentially a profile of interests and activities, to create search queries, perform search operations and present retrieved results. A PIR system would operate in the background while the user undertakes their tasks, interrupting the user when potentially useful information has been identified. A middle ground between standard reactive IR and PIR is provided by agent-supported IR, where a proactive agent can provide assistance within a user-driven IR system, e.g., to support query creation or navigation of retrieved results.
12
Analyticup - EvalRS: A Rounded Evaluation of Recommender Systems
Jacopo Tagliabue
Email
Half Day AM
Organizers
Jacopo Tagliabue (NYU), Federico Bianchi (Stanford), Tobias Schnabel (Microsoft), Giuseppe Attanasio (Bocconi University), Ciro Greco (SPC), Gabriel de Souza P. Moreira (NVIDIA), Patrick John Chia (Coveo)
Abstract
Recommender Systems (RSs) are used as part of complex applications and affect user experience through a varied range of user interfaces. However, research has mostly focused on the ability of RSs to produce accurate item rankings, while giving little attention to evaluation in real-world scenarios. Such a narrow focus has limited the capacity of RSs to have a lasting positive impact, and makes them vulnerable to undesired behavior, such as reinforcing biases. The EvalRS workshop presents the results of a first-of-its-kind data challenge, in which recommenders are evaluated on several dimensions and testing methodologies are built in the open. Through world-class invited speakers and paper presentations, the workshop aims to foster a principled discussion on RS evaluation and fairness, and to release reusable artifacts to the community for better testing at scale.