FuXi-beta

Official PyTorch implementation of FuXi-beta: Towards a Lightweight and Fast Large-Scale Generative Recommendation Model.

1. Paper

Yufei Ye, Wei Guo, Hao Wang, Hong Zhu, Yuyang Ye, Yong Liu, Huifeng Guo, Ruiming Tang, Defu Lian, and Enhong Chen. FuXi-beta: Towards a Lightweight and Fast Large-Scale Generative Recommendation Model. arXiv preprint arXiv:2508.10615, 2025.


FuXi-beta studies efficiency bottlenecks in large-scale generative recommendation models and proposes a lightweight design for faster training and inference. The repository provides the PyTorch implementation and public MovieLens experiment configs.

2. Highlights

Targets training and inference efficiency for large-scale generative recommendation.
Builds on FuXi-alpha and HSTU-style sequential recommendation components.
Introduces lightweight attention/bias handling in the FuXi-beta sequential encoder.
Provides public MovieLens configs for reproducible experiments.

3. Method At A Glance

[Figure: FuXi-beta method overview]

FuXi-beta analyzes bottlenecks from relative temporal attention bias and query-key attention-map computation, then replaces expensive operations with a lightweight token-mixing path.
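To make the idea concrete, here is a toy sketch of token mixing whose weights depend only on relative time gaps, with no query-key attention map. It is an illustration under loud assumptions: the exponential decay, the `decay` parameter, the row normalization, and the function names are all hypothetical and are not taken from the paper or from fuxi_beta.py.

```python
import math


def temporal_mixing_weights(timestamps, decay=0.1):
    """Causal mixing weights computed from time gaps only.

    Assumes `timestamps` is sorted in non-decreasing order. The exponential
    decay is an illustrative choice, not the paper's bias function.
    """
    n = len(timestamps)
    weights = []
    for i in range(n):
        # Position i may only mix information from positions j <= i.
        raw = [math.exp(-decay * (timestamps[i] - timestamps[j])) for j in range(i + 1)]
        total = sum(raw)
        # Normalize the visible positions and zero out the future ones.
        weights.append([r / total for r in raw] + [0.0] * (n - i - 1))
    return weights


def mix_tokens(values, weights):
    """Mix value vectors with precomputed weights (no query-key products)."""
    dim = len(values[0])
    return [
        [sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
        for row in weights
    ]


timestamps = [0.0, 1.0, 3.0]                   # toy interaction times
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy item embeddings
mixed = mix_tokens(values, temporal_mixing_weights(timestamps))
```

Because the weights never look at token content, they can be computed once per sequence from the timestamps alone, which is the kind of saving the paragraph above refers to.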

4. Repository Structure

.
├── configs/                                      # MovieLens experiment configs
├── generative_recommenders/modeling/sequential/  # FuXi-beta and baseline encoders
├── generative_recommenders/trainer/              # Training pipeline
├── main.py                                       # Distributed training entry
├── preprocess_public_data.py                     # MovieLens preprocessing
├── requirements.txt
└── docs/                                         # GitHub Pages project page

The FuXi-beta model code is under generative_recommenders/modeling/sequential/fuxi_beta.py.

5. Installation

Install PyTorch following the official instructions for your CUDA environment, then install dependencies:

pip install -r requirements.txt

For reference, the original quick setup installed the dependencies directly:

pip3 install gin-config absl-py scikit-learn scipy matplotlib numpy apex hypothesis pandas fbgemm_gpu iopath

6. Data

Prepare the public MovieLens data:

mkdir -p tmp/
python3 preprocess_public_data.py

7. Quick Start

Run FuXi-beta on MovieLens-1M:

CUDA_VISIBLE_DEVICES=0 python3 main.py \
  --gin_config_file=configs/ml-1m/fuxi-beta-sampled-softmax-n128-final.gin \
  --master_port=12345

Other configurations are included in configs/ml-1m/ and configs/ml-20m/.
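When launching several configs in a row, the command above can be assembled programmatically. The following is a minimal sketch that only reuses the flags shown in this README (`--gin_config_file`, `--master_port`) and the `CUDA_VISIBLE_DEVICES` variable; the helper name `build_launch` and the port value 12346 are hypothetical choices for illustration.

```python
import os
import shlex


def build_launch(config_path, gpu_ids="0", master_port=12345):
    """Return (env, argv) mirroring the Quick Start command for main.py."""
    # Copy the current environment and restrict the visible GPUs.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu_ids)
    argv = [
        "python3",
        "main.py",
        f"--gin_config_file={config_path}",
        f"--master_port={master_port}",
    ]
    return env, argv


env, argv = build_launch(
    "configs/ml-20m/fuxi-beta-sampled-softmax-n128-final.gin",
    master_port=12346,  # illustrative: a distinct port for a second job
)
print(shlex.join(argv))
# To actually start training: subprocess.run(argv, env=env, check=True)
```

Printing the command before running it makes it easy to confirm the config path and port.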

8. Reproducing Results

A GPU with 24 GB or more of HBM should be sufficient for most public MovieLens settings.

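Training logs are written to exps/ by default. The commands below assume the repository is checked out at ~/generative-recommenders; both use port 24001, so run one at a time or change --port for the second. Launch TensorBoard with: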

tensorboard --logdir ~/generative-recommenders/exps/ml-1m-l200/ --port 24001 --bind_all
tensorboard --logdir ~/generative-recommenders/exps/ml-20m-l200/ --port 24001 --bind_all

9. Configuration Notes

configs/ml-1m/fuxi-beta-sampled-softmax-n128-final.gin: default MovieLens-1M FuXi-beta setting.
configs/ml-20m/fuxi-beta-sampled-softmax-n128-final.gin: default MovieLens-20M FuXi-beta setting.
Baseline configs for FuXi-alpha, HSTU, SASRec, and FLASH are included for comparison.
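To see which configs are available locally, a short script can list the .gin files per dataset directory. This is a sketch that assumes the configs/<dataset>/*.gin layout described above; the function name `list_gin_configs` is hypothetical and not part of the repository.

```python
from pathlib import Path


def list_gin_configs(config_root="configs"):
    """Map each dataset directory (e.g. ml-1m) to its sorted .gin file names."""
    root = Path(config_root)
    if not root.is_dir():
        # Nothing to list if the script is not run from the repository root.
        return {}
    return {
        d.name: sorted(p.name for p in d.glob("*.gin"))
        for d in sorted(root.iterdir())
        if d.is_dir()
    }


for dataset, names in list_gin_configs().items():
    print(dataset)
    for name in names:
        print("  " + name)
```

Run it from the repository root so that the relative configs/ path resolves.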

10. Experimental Highlights

[Image: FuXi-beta public and industrial results]

[Image: FuXi-beta ablation and compatibility results]

The paper tables above show the accuracy-cost tradeoff: FuXi-beta is compared with FuXi-alpha, HSTU, and SASRec variants, then analyzed through attention and compatibility ablations.

FuXi-beta is positioned as a lightweight successor to heavier generative recommendation models. The code includes both FuXi-alpha and FuXi-beta components for direct architectural comparison.

Finding	Paper evidence	Takeaway
Industrial accuracy	On large-scale industrial datasets, FuXi-beta reports a 27% to 47% improvement in NDCG@10 over FuXi-alpha.	The lightweight design is more than an efficiency change: accuracy also improves.
Public benchmark behavior	The paper reports performance comparable to prior state of the art on public datasets while significantly reducing training time.	FuXi-beta targets a better accuracy-cost balance.
Query-key attention ablation	Removing the query-key attention map improves MovieLens-1M NDCG@10 from 0.1871 to 0.1947 and reduces relative time from 1.000 to 0.951; on MovieLens-20M, NDCG@10 moves from 0.2097 to 0.2117 and time from 1.000 to 0.842.	The paper's efficiency claim is tied to a concrete architectural simplification.
Temporal attention ablation	Removing temporal attention lowers MovieLens-20M NDCG@10 from 0.2097 to 0.1863.	The lightweight model still depends on explicit temporal information.

Conclusion: FuXi-beta keeps the useful temporal structure from FuXi-alpha while removing expensive attention components that are less helpful for recommendation.
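The relative changes implied by the ablation rows in the table can be worked out directly. The snippet below is plain arithmetic on the numbers quoted above; the labels and the helper name `relative_change` are chosen here for illustration.

```python
def relative_change(before, after):
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


# (label, before, after) triples copied from the ablation rows above.
rows = [
    ("ML-1M NDCG@10, no query-key map", 0.1871, 0.1947),
    ("ML-1M relative time, no query-key map", 1.000, 0.951),
    ("ML-20M NDCG@10, no query-key map", 0.2097, 0.2117),
    ("ML-20M relative time, no query-key map", 1.000, 0.842),
    ("ML-20M NDCG@10, no temporal attention", 0.2097, 0.1863),
]

changes = {label: relative_change(before, after) for label, before, after in rows}
for label, pct in changes.items():
    print(f"{label}: {pct:+.1f}%")
```

On these numbers, removing the query-key map changes NDCG@10 by about +4.1% on MovieLens-1M and +1.0% on MovieLens-20M while cutting relative time by 4.9% and 15.8%, whereas removing temporal attention costs about 11.2% NDCG@10 on MovieLens-20M.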

11. Notes For Maintainers

Keep fuxi_beta.py and the fuxi-beta-* config names stable because README commands depend on them.
If new benchmark configs are added, mirror them in both the README and project page.
Do not change preprocessing assumptions without verifying compatibility with the existing MovieLens configs.

12. Citation

If you find FuXi-beta useful, please cite:

@article{ye2025fuxibeta,
  title={FuXi-beta: Towards a Lightweight and Fast Large-Scale Generative Recommendation Model},
  author={Ye, Yufei and Guo, Wei and Wang, Hao and Zhu, Hong and Ye, Yuyang and Liu, Yong and Guo, Huifeng and Tang, Ruiming and Lian, Defu and Chen, Enhong},
  journal={arXiv preprint arXiv:2508.10615},
  year={2025}
}

13. Contact

First author: Yufei Ye (aboluo2003@mail.ustc.edu.cn).
Corresponding authors: Hao Wang (wanghao3@ustc.edu.cn), Yong Liu (liu.yong6@huawei.com), Defu Lian (liandefu@ustc.edu.cn), and Enhong Chen (cheneh@ustc.edu.cn).
Repository questions: please open a GitHub issue in this repository.