Hi there, I am Zhanke Zhou, a Ph.D. student in the Trustworthy Machine Learning and Reasoning (TMLR) Group, Department of Computer Science, Hong Kong Baptist University, advised by Dr. Bo Han.
My research interests lie in trustworthy machine reasoning with structured data and foundation models, aiming to solve complex problems (e.g., combinatorial optimization and planning) and to advance scientific discovery (e.g., in math, biology, and chemistry).
Please feel free to email me about research, collaborations, or just a casual chat.
📖 Education
- 2022.09 - present, Hong Kong Baptist University (HKBU), Ph.D. in Computer Science.
- 2017.09 - 2021.06, Huazhong University of Science and Technology (HUST), B.E. in Electronics and Information Engineering (SeedClass).
📝 Publications on Trustworthy Language Model Reasoning
* Co-first author, ✉️ Corresponding author.
Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?
Zhanke Zhou, Rong Tao, Jianing Zhu, Yiwen Luo, Zengmao Wang, Bo Han✉️.
In NeurIPS 2024.
[paper]
[code]
[slides]
[poster]
[EN-video]
[CN-video]
Quick Introduction
This paper investigates an under-explored challenge in large language models (LLMs): chain-of-thought prompting with noisy rationales, i.e., in-context examples whose reasoning steps contain irrelevant or inaccurate thoughts. We construct the NoRa dataset, tailored to evaluate the robustness of reasoning in the presence of noisy rationales. Our findings on NoRa reveal a prevalent vulnerability to such noise among current LLMs, with existing robust methods like self-correction and self-consistency showing limited efficacy. Notably, compared to prompting with clean rationales, the base LLM drops by 1.4%-19.8% in accuracy with irrelevant thoughts and more drastically by 2.2%-40.4% with inaccurate thoughts.

Addressing this challenge necessitates external supervision that is accessible in practice. Here, we propose the method of contrastive denoising with noisy chain-of-thought (CD-CoT). It enhances LLMs' denoising-and-reasoning capabilities by contrasting the noisy rationales with only one clean rationale, which can be the minimal requirement for denoising-purpose prompting. The method follows a principle of exploration and exploitation: (1) rephrasing and selecting rationales in the input space to achieve explicit denoising, and (2) exploring diverse reasoning paths and voting on answers in the output space. Empirically, CD-CoT achieves an average improvement of 17.8% in accuracy over the base model and shows significantly stronger denoising capabilities than the baseline methods.
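For illustration only, the sketch below mirrors the exploration-and-exploitation principle described above with a generic `llm(prompt)` completion callable; the prompt wording, selection heuristic, and answer extraction are simplifying assumptions, not the released CD-CoT implementation.

```python
from collections import Counter

def cd_cot_sketch(llm, noisy_examples, clean_example, question,
                  n_rephrase=3, n_paths=5):
    """Illustrative contrastive-denoising pipeline, not the official CD-CoT code.

    llm:            callable that maps a prompt string to a completion string.
    noisy_examples: list of (question, noisy_rationale, answer) triples.
    clean_example:  a single (question, clean_rationale, answer) triple.
    """
    cq, cr, ca = clean_example
    clean_text = f"Q: {cq}\nRationale: {cr}\nA: {ca}"

    # (1) Input space: rephrase each noisy rationale by contrasting it with the
    #     single clean rationale, then select one rephrased version.
    denoised = []
    for q, noisy_rationale, a in noisy_examples:
        candidates = [
            llm(f"A correct example:\n{clean_text}\n\n"
                f"Rewrite this rationale for question '{q}', removing irrelevant "
                f"or inaccurate steps:\n{noisy_rationale}")
            for _ in range(n_rephrase)
        ]
        denoised.append((q, min(candidates, key=len), a))  # selection step simplified here

    # (2) Output space: explore several reasoning paths with the denoised prompt
    #     and vote on the final answer.
    prompt = "\n\n".join(f"Q: {q}\nRationale: {r}\nA: {a}" for q, r, a in denoised)
    completions = [llm(f"{prompt}\n\nQ: {question}\nRationale:") for _ in range(n_paths)]
    votes = Counter(c.strip().split("\n")[-1] for c in completions)  # last line as the answer
    return votes.most_common(1)[0][0]
```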
DeepInception: Hypnotize Large Language Model to Be Jailbreaker.
Xuan Li*, Zhanke Zhou*, Jianing Zhu*, Jiangchao Yao, Tongliang Liu, Bo Han✉️.
In NeurIPS 2024 SafeGenAI Workshop.
[paper]
[code]
[slides]
[twitter]
[CN-video]
[CN-blog]
[DeepTech]
Quick Introduction
Despite remarkable success in various applications, large language models (LLMs) are vulnerable to adversarial jailbreaks that render their safety guardrails void. However, previous studies of jailbreaks usually resort to brute-force optimization or extrapolation at a high computational cost, which may be neither practical nor effective.

In this paper, inspired by the Milgram experiment on how authority can incite harmful behavior, we disclose a lightweight method, termed DeepInception, which can hypnotize an LLM into being a jailbreaker. Specifically, DeepInception leverages the personification ability of the LLM to construct a virtual, nested scene for the jailbreak, realizing an adaptive way to escape the usage control of a normal scenario.
Empirically, DeepInception achieves jailbreak success rates competitive with previous counterparts and realizes continuous jailbreaks in subsequent interactions, revealing the critical weakness of self-losing in both open-source and closed-source LLMs such as Falcon, Vicuna-v1.5, Llama-2, GPT-3.5, and GPT-4.
Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection.
Chentao Cao, Zhun Zhong, Zhanke Zhou, Yang Liu, Tongliang Liu, Bo Han✉️.
In ICML 2024.
[paper]
[code]
[slides]
[poster]
[CN-video]
[CN-blog]
Quick Introduction
Detecting out-of-distribution (OOD) samples is essential when deploying machine learning models in open-world scenarios. Zero-shot OOD detection, which requires no training on in-distribution (ID) data, has become possible with the advent of vision-language models like CLIP. Existing methods build a text-based classifier with only closed-set labels. However, this largely restricts the inherent capability of CLIP to recognize samples from a large and open label space.

In this paper, we propose to tackle this constraint by leveraging the expert knowledge and reasoning capability of large language models (LLMs) to Envision potential Outlier Exposure, termed EOE, without access to any actual OOD data. Owing to its better adaptation to open-world scenarios, EOE generalizes to different tasks, including far, near, and fine-grained OOD detection. Technically, we design (1) LLM prompts based on visual similarity to generate potential outlier class labels specialized for OOD detection, as well as (2) a new score function based on a potential outlier penalty to distinguish hard OOD samples effectively. Empirically, EOE achieves state-of-the-art performance across different OOD tasks and can be effectively scaled to the ImageNet-1K dataset.
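As a rough illustration of the scoring idea only (not the paper's exact score function), the sketch below scores an image against both the closed-set ID labels and LLM-envisioned outlier labels using CLIP-style embeddings; the penalty form and temperature are assumptions.

```python
import numpy as np

def eoe_style_score(image_feat, id_text_feats, outlier_text_feats, temp=0.01):
    """Illustrative zero-shot OOD score with envisioned outlier labels.

    image_feat:         (d,) L2-normalized image embedding (e.g., from CLIP).
    id_text_feats:      (n_id, d) embeddings of the in-distribution class names.
    outlier_text_feats: (n_out, d) embeddings of LLM-envisioned outlier class names.
    Returns a score where larger means "more likely in-distribution".
    """
    # Softmax over similarities to both ID and envisioned outlier class names.
    logits = np.concatenate([id_text_feats @ image_feat,
                             outlier_text_feats @ image_feat]) / temp
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    n_id = len(id_text_feats)
    # Confidence on ID classes, penalized by mass assigned to envisioned outliers
    # (an assumed penalty form for illustration).
    return probs[:n_id].max() - probs[n_id:].max()
```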
📝 Publications on Trustworthy Graph Model Reasoning
On Strengthening and Defending Graph Reconstruction Attack with Markov Chain Approximation.
Zhanke Zhou, Chenyu Zhou, Xuan Li, Jiangchao Yao✉️, Quanming Yao, Bo Han✉️.
In ICML 2023.
[paper]
[code]
[slides]
[poster]
[EN-video]
[CN-video]
[CN-blog]
Quick Introduction
Although powerful graph neural networks (GNNs) have boosted numerous real-world applications, the potential privacy risk is still under-explored. To close this gap, we perform the first comprehensive study of graph reconstruction attacks, which aim to reconstruct the adjacency of nodes. We show that a range of factors in GNNs can lead to a surprising leakage of private links. Especially, by viewing GNNs as a Markov chain and attacking them via a flexible chain approximation, we systematically explore the underlying principles of graph reconstruction attacks and propose two information-theory-guided mechanisms: (1) a chain-based attack method with adaptive designs for extracting more private information, and (2) a chain-based defense method that sharply reduces attack fidelity with only a moderate accuracy loss.

These two objectives disclose a critical belief: to recover better in attack, you must extract more multi-aspect knowledge from the trained GNN, while to learn safer for defense, you must forget more link-sensitive information when training GNNs. Empirically, we achieve state-of-the-art results on six datasets and three common GNNs.
Combating Bilateral Edge Noise for Robust Link Prediction.
Zhanke Zhou, Jiangchao Yao✉️, Jiaxu Liu, Xiawei Guo, Quanming Yao, Li He, Liang Wang, Bo Zheng, Bo Han✉️.
In NeurIPS 2023.
[paper]
[code]
[slides]
[poster]
[EN-video]
[CN-video]
[CN-blog]
Quick Introduction
Although link prediction on graphs has achieved great success with the development of graph neural networks (GNNs), its robustness under edge noise remains under-investigated. To close this gap, we first conduct an empirical study showing that edge noise bilaterally perturbs both the input topology and the target labels, yielding severe performance degradation and representation collapse.

To address this dilemma, we propose an information-theory-guided principle, Robust Graph Information Bottleneck (RGIB), to extract reliable supervision signals and avoid representation collapse. Different from the basic information bottleneck, RGIB further decouples and balances the mutual dependence among graph topology, target labels, and representation, building new learning objectives for robust representations against the bilateral noise. Two instantiations, RGIB-SSL and RGIB-REP, are explored to leverage the merits of different methodologies, i.e., self-supervised learning and data reparameterization, for implicit and explicit data denoising, respectively. Extensive experiments on six datasets and three GNNs under diverse noisy scenarios verify the effectiveness of our RGIB instantiations.
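For context, the basic (graph) information bottleneck that RGIB extends can be written as follows; this is only the standard formulation, not RGIB's decoupled objective.

```latex
% Standard information-bottleneck objective for a representation Z of the
% (possibly noisy) input graph A with target labels Y; \beta trades off
% prediction against compression. RGIB's actual objective further decouples
% and balances the dependencies among A, Y, and Z.
\max_{Z} \; I(Z; Y) \;-\; \beta \, I(Z; A)
```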
Less is More: One-shot Subgraph Reasoning on Large-scale Knowledge Graphs.
Zhanke Zhou, Yongqi Zhang, Jiangchao Yao, Quanming Yao, Bo Han✉️.
In ICLR 2024.
[paper]
[code]
[slides]
[poster]
[EN-video]
Quick Introduction
To deduce new facts on a knowledge graph (KG), a link predictor learns from the graph structure and collects local evidence to find the answer to a given query. However, existing methods suffer from a severe scalability problem because they use the whole KG for prediction, which hinders their promise on large-scale KGs and cannot be directly addressed by vanilla sampling methods.

In this work, we propose one-shot-subgraph link prediction to achieve efficient and adaptive prediction. The design principle is that, instead of directly acting on the whole KG, the prediction procedure is decoupled into two steps: (i) extracting only one subgraph according to the query, and (ii) predicting on this single, query-dependent subgraph. We reveal that the non-parametric and computation-efficient heuristic, Personalized PageRank (PPR), can effectively identify the potential answers and supporting evidence. With efficient subgraph-based prediction, we further introduce automated searching for the optimal configurations in both the data and model spaces. Empirically, we achieve improved efficiency and leading performance on five large-scale benchmarks.
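A minimal sketch of the query-dependent extraction step, using NetworkX's personalized PageRank; the seed choice and top-k cutoff below are illustrative assumptions rather than the paper's exact configuration.

```python
import networkx as nx

def extract_one_shot_subgraph(graph: nx.Graph, query_entity, k=1000, alpha=0.85):
    """Extract a single query-dependent subgraph via Personalized PageRank (PPR).

    graph:        the full knowledge graph, here as an undirected NetworkX graph.
    query_entity: the query's head entity, used as the PPR seed.
    k:            number of highest-scoring entities to keep (illustrative value).
    """
    # PPR scores concentrate on entities reachable from and relevant to the seed.
    ppr = nx.pagerank(graph, alpha=alpha, personalization={query_entity: 1.0})
    keep = sorted(ppr, key=ppr.get, reverse=True)[:k]
    # Prediction then runs only on this small induced subgraph.
    return graph.subgraph(keep).copy()
```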
Neural Atoms: Propagating Long-range Interaction in Molecular Graphs through Efficient Communication Channel.
Xuan Li*, Zhanke Zhou*, Jiangchao Yao, Yu Rong, Lu Zhang, Bo Han✉️.
In ICLR 2024.
[paper]
[code]
[slides]
[poster]
[EN-video]
[CN-video]
Quick Introduction
Graph neural networks (GNNs) have been widely adopted for drug discovery with molecular graphs. Nevertheless, current GNNs mainly excel at leveraging short-range interactions (SRI) but struggle to capture long-range interactions (LRI), both of which are crucial for determining molecular properties.

To tackle this issue, we propose a method that abstracts the collective information of atomic groups into a few Neural Atoms by implicitly projecting the atoms of a molecule. Specifically, we explicitly exchange information among the neural atoms and project it back to the atoms' representations as an enhancement. With this mechanism, neural atoms establish communication channels among distant nodes, effectively reducing the interaction scope of arbitrary node pairs to a single hop. To provide a physical inspection of our method, we reveal its connection to the traditional LRI calculation method, Ewald summation. Neural Atoms can enhance GNNs to capture LRI by approximating the potential LRI of the molecule. We conduct extensive experiments on four long-range graph benchmarks, covering graph-level and link-level tasks on molecular graphs, and achieve up to 27.32% and 38.27% improvements in the 2D and 3D scenarios, respectively. Empirically, our method can be equipped with an arbitrary GNN to help capture LRI.
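The PyTorch-style sketch below illustrates the grouping idea in general terms (soft-assign atoms to a few neural atoms, exchange information among them, broadcast it back); the attention-based assignment and layer choices are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class NeuralAtomSketch(nn.Module):
    """Illustrative neural-atom layer: soft-assign atoms to K neural atoms,
    mix information among neural atoms, and broadcast it back to the atoms."""

    def __init__(self, dim: int, num_neural_atoms: int = 8):
        super().__init__()
        self.assign = nn.Linear(dim, num_neural_atoms)  # soft assignment scores
        self.mix = nn.Linear(dim, dim)                  # exchange among neural atoms
        self.out = nn.Linear(dim, dim)

    def forward(self, atom_feats: torch.Tensor) -> torch.Tensor:
        # atom_feats: (num_atoms, dim) node features from a GNN backbone.
        attn = torch.softmax(self.assign(atom_feats), dim=0)  # (num_atoms, K)
        neural_atoms = attn.T @ atom_feats                     # (K, dim) pooled groups
        neural_atoms = torch.relu(self.mix(neural_atoms))      # global exchange in one hop
        broadcast = attn @ neural_atoms                        # (num_atoms, dim)
        return atom_feats + self.out(broadcast)                # enhanced atom features
```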
AdaProp: Learning Adaptive Propagation for Graph Neural Network Based Knowledge Graph Reasoning.
Yongqi Zhang*, Zhanke Zhou*, Quanming Yao✉️, Xiaowen Chu, Bo Han.
In KDD 2023.
[paper]
[code]
[slides]
[poster]
[EN-video]
[CN-video]
Quick Introduction
Due to the popularity of graph neural networks (GNNs), various GNN-based methods have been designed to reason on knowledge graphs (KGs). An important design component of GNN-based KG reasoning methods is the propagation path, which contains the set of entities involved in each propagation step. Existing methods use hand-designed propagation paths, ignoring the correlation between the entities and the query relation; in addition, the number of involved entities grows explosively at larger propagation steps.

In this work, we are motivated to learn an adaptive propagation path that filters out irrelevant entities while preserving promising targets. First, we design an incremental sampling mechanism in which the nearby targets and layer-wise connections are preserved with linear complexity. Second, we design a learning-based sampling distribution to identify the semantically related entities. Extensive experiments show that our method is powerful, efficient, and semantic-aware.
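A simplified sketch of the incremental, query-aware sampling idea described above, assuming a generic `score(entity, query)` relevance function; the actual AdaProp sampler is learned end-to-end and differs in detail.

```python
def adaptive_propagation_sketch(neighbors, score, query, seeds, num_steps=3, k=100):
    """Illustrative incremental sampling of a propagation path (not AdaProp itself).

    neighbors: dict mapping an entity to the set of its neighboring entities.
    score:     callable giving a query-dependent relevance score for an entity.
    seeds:     query entities (e.g., the head entity) to start propagation from.
    """
    visited = set(seeds)
    frontier = set(seeds)
    path = [set(seeds)]
    for _ in range(num_steps):
        # Expand only from the current frontier, so growth stays roughly linear.
        candidates = {n for e in frontier for n in neighbors.get(e, ())} | frontier
        # Keep the top-k entities most relevant to the query; drop the rest.
        frontier = set(sorted(candidates, key=lambda e: score(e, query), reverse=True)[:k])
        visited |= frontier
        path.append(frontier)
    return path, visited
```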
🎖 Awards
- 2024.10, Excellent Research Gold Award of TMLR Group.
- 2024.06, Best Poster Award by COMP of HKBU.
- 2024.05, Best Research Performance Award by COMP of HKBU.
- 2023.11, Research Excellence Award by COMP of HKBU.
- 2021.06, Honorary Degree of HUST (top 2%, the highest honor for undergraduates).
- 2021.06, Outstanding Graduate Award of HUST.
💬 Talks
- 2024.11, Seminar on Trustworthy Machine Learning and Foundation Models @AI Time, Online. [Video]
- 2023.11, Seminar on Trustworthy Machine Learning with Imperfect Data @TechBeat, Online. [Video]
- 2023.11, Youth PhD Talk on Trustworthy Machine Learning @AI Time, Online. [Video]
💻 Services
- Conference Reviewer for ICML, NeurIPS, ICLR, AISTATS, ACML, AAAI, IJCAI, COLM, CIKM, SIGKDD.
- Journal Reviewer for TMLR, NEUNET, TNNLS, TKDE.
🏫 Teaching
- Teaching Assistant for COMP7250: Machine Learning.
- Teaching Assistant for COMP3015: Data Communications and Networking.
- Teaching Assistant for COMP7070: Advanced Topics in Artificial Intelligence and Machine Learning.
📖 Experience
- 2022.09 - present, PhD student @HKBU-TMLR Group, advised by Dr. Bo Han.
- 2022.02 - 2022.09, Research assistant @HKBU-TMLR Group, advised by Dr. Bo Han and Dr. Jiangchao Yao.
- 2021.01 - 2024.05, Visiting student @THU-LARS Group, advised by Dr. Quanming Yao and Dr. Yongqi Zhang.
- 2020.06 - 2020.09, Research intern @SJTU-MVIG Group, advised by Dr. Cewu Lu and Dr. Yonglu Li.
- 2018.03 - 2021.01, Core Member @HUST-Dian Group, advised by Dr. Yayu Gao, Dr. Chengwei Zhang, and Dr. Xiaojun Hei.
💻 Resources
I hold that life’s best resources, like air, should be free.
Hence, I champion open-source research and hope the following resources can benefit you :)
Projects
- Awesome-model-inversion-attack
- DeepInception
- FLDRL-in-Wireless-Communication
- AutoSF
- MC-GRA
- RGIB
- AdaProp
- KGTuner
- graph-ood-detection
- one-shot-subgraph
- NeuralAtom
- Awesome-Graph-Prompting
Source Files of My Talks or Posters
- All my posters are here. [slides-pdf] [slides-pptx]
- 20240420: Less is more: One-shot subgraph reasoning on large-scale knowledge graphs. [slides-pdf] [slides-pptx]
- 20240327: Experience sharing on my research career. [slides-pdf] [slides-pptx]
- 20231214: Robust graph information bottleneck. [slides-pdf] [slides-pptx]
- 20231114: Graph reconstruction attack. [slides-pdf] [slides-pptx]
- 20231109: Recent advances on LLM reasoning with graphs. [slides-pdf] [slides-pptx]
- 20230705: Paper reading of AAGOD. [slides-pdf] [slides-pptx]
- 20230705: Paper reading of CFLP. [slides-pdf] [slides-pptx]
- 20230207: Model inversion attack: From images to graphs. [slides-pdf] [slides-pptx]
- 20221028: A review of GNN explanation methods. [slides-pdf] [slides-pptx]
- 20221026: Paper reading of CoLE. [slides-pdf] [slides-pptx]
- 20220630: Paper reading of GSAT. [slides-pdf] [slides-pptx]
- 20220325: Learning query-dependent propagation for knowledge graph reasoning. [slides-pdf] [slides-pptx]
- 20211112: Paper reading of GraIL. [slides-pdf] [slides-pptx]
- 20210709: Understanding and benchmarking model search for knowledge graph embedding. [slides-pdf] [slides-pptx]
- 20211112: Paper reading of Interstellar. [slides-pdf] [slides-pptx]
- 20201213: Paper reading of UAMT. [slides-pdf] [slides-pptx]