Research
My research focuses on Large Language Models (LLMs) and Deep Learning (DL). I mainly work on LLMs for genomics. I have also worked on several projects in conversational AI and user modeling.
|
|
USE: Dynamic User Modeling with Stateful Sequence Models
Zhihan Zhou*, Qixiang Fang*, Leonardo Neves, Yozen Liu, Francesco Barbieri, Han Liu, Maarten W. Bos, Ron Dotsch (*: co-first author)
Preprint, 2024
paper
Seamlessly integrates past and present user behavior sequences to produce user embeddings.
|
|
DNABERT-S: Learning Species-Aware DNA Embedding with Genome Foundation Models
Zhihan Zhou*, Weimin Wu*, Harrison Ho, Jiayi Wang, Lizhen Shi, Ramana V Davuluri, Zhong Wang, Han Liu (*: co-first author)
Preprint, 2024
paper / code / model
DNABERT-S is a foundation model based on DNABERT-2, specifically designed to generate DNA embeddings that naturally cluster and segregate the genomes of different species in the embedding space.
|
|
General-Purpose User Modeling with Behavioral Logs: A Case Study with Snapchat
Qixiang Fang*, Zhihan Zhou*, Francesco Barbieri, Yozen Liu, Leonardo Neves, Dong Nguyen, Daniel L. Oberski, Maarten W. Bos, Ron Dotsch (*: co-first author)
SIGIR 2024
paper
A Transformer-based model that learns general-purpose user embeddings from Snapchat user behavioral logs.
|
|
Efficient Action Robust Reinforcement Learning with Probabilistic Policy Execution Uncertainty
Guanlin Liu, Zhihan Zhou, Han Liu, Lifeng Lai
Preprint, 2023
paper
A novel approach for solving action robust RL problems with probabilistic policy execution uncertainty.
|
|
DNABERT-2: Efficient Foundation Model and Benchmark For Multi-Species Genome
Zhihan Zhou, Yanrong Ji, Weijian Li, Pratik Dutta, Ramana Davuluri, Han Liu
ICLR 2024
paper / code / model
We introduce DNABERT-2, an efficient and effective foundation model for multi-species genomes that achieves state-of-the-art performance with 20 times fewer parameters. We also provide a benchmark, Genome Understanding Evaluation (GUE), containing 28 datasets across 7 tasks.
|
|
KGML-xDTD: A Knowledge Graph-based Machine Learning Framework for Drug Treatment Prediction and Mechanism Description
Chunyu Ma, Zhihan Zhou, Han Liu, David Koslicki
GigaScience, 2023
code
Optimizing Graph Neural Networks and Reinforcement Learning for link prediction and path finding on one of the largest biomedical knowledge graphs.
|
|
Learning Dialogue Representations from Consecutive Utterances
Zhihan Zhou, Dejiao Zhang, Wei Xiao, Nicholas Dingwall, Xiaofei Ma, Andrew O Arnold, Bing Xiang
NAACL 2022
paper / code
We introduce DSE, a pre-trained language model that achieves strong performance in few-shot dialogue understanding. It is trained with contrastive learning on consecutive utterances of dialogues.
|
|
Trade the Event: Corporate Events Detection for News-Based Event-Driven Trading
Zhihan Zhou, Liqian Ma, Han Liu
ACL 2021, Findings
paper / video / code
We analyze the impact of news articles on the stock market. Treating corporate events as the driving force, we introduce methods for corporate event detection and provide new datasets for text-based stock prediction and event detection.
|
|
DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome
Yanrong Ji*, Zhihan Zhou*, Han Liu, Ramana V Davuluri (*: co-first author)
Bioinformatics, 2021
paper / code / model
We introduce a new paradigm for DNA analysis. DNABERT is pre-trained on the human genome and achieves state-of-the-art performance on a wide range of DNA analysis problems, including promoter prediction and splice site prediction.
|
|
Few-shot Slot Tagging with Collapsed Dependency Transfer and Label-enhanced Task-adaptive Projection Network
Yutai Hou, Wanxiang Che, Yongkui Lai, Zhihan Zhou, Yijia Liu, Han Liu, Ting Liu
ACL 2020
paper / code
We introduce a novel method for few-shot sequence labeling that incorporates label information and learned label transfer into few-shot prediction.
|
|
Joint Speaker Diarization and Recognition Using Convolutional and Recurrent Neural Networks
Zhihan Zhou, Yichi Zhang, Zhiyao Duan
ICASSP 2018
paper
Combines speaker identity predictions and speaker change rate predictions for joint speaker diarization and recognition.
|
|
ELLPMDA: Ensemble learning and link prediction for miRNA-disease association prediction
Xing Chen*, Zhihan Zhou*, Yan Zhao (*: co-first author)
RNA Biology, 2018
paper
We perform ensemble learning over multiple graph learning methods to achieve accurate miRNA-disease association prediction.
|
This website is built upon link. Thanks to Jon Barron for releasing the source code and to Qinjie for helping.