Simin Chen

Postdoctoral Researcher at Columbia University

Computer Science Department at Columbia University

Biography

I am a postdoctoral researcher in the Computer Science Department at Columbia University, working with Prof. Baishakhi Ray on large language models for code (LLM4Code). I earned my Ph.D. from the University of Texas at Dallas (UTD), where I was fortunate to be advised by Prof. Wei Yang and Prof. Cong Liu. Before joining UTD, I received my master's degree from Tongji University in May 2018. My research interests lie in machine learning, computer security, and program analysis.

Prospective Students: I'm actively looking for self-motivated students to join my research group at GMU CS. If you are interested in

(1) LLMs for Code / Software Engineering,

(2) Trustworthy AI Systems,

You are also welcome to email me at siminchen.phd@gmail.com with the subject line “Research Internship Application - [Your Name]” to discuss potential research opportunities.

Download my résumé.

Interests
  • Machine Learning
  • Computer Security
  • Software Engineering
Education
  • Ph.D., 2019 - 2024

    The University of Texas at Dallas

  • Master, 2015 - 2018

    Tongji University

  • Bachelor, 2011 - 2015

    Tongji University

Publications

(2026). Your Compiler is Backdooring Your Model: Understanding and Exploiting Compilation Inconsistency Vulnerabilities in Deep Learning Compilers. In S&P 2026 🏆 Distinguished Paper Award.

(2026). IAG: Input-aware Backdoor Attack on VLM-based Visual Grounding. In CVPR 2026.

(2026). FedRE: A Representation Entanglement Framework for Model-Heterogeneous Federated Learning. In CVPR 2026.

(2026). CodeSense: a Real-World Benchmark and Dataset for Code Semantic Reasoning. In ICLR 2026.

(2026). From Assistant to Independent Developer - Are GPTs Ready for Software Development? In ICLR 2026.

(2025). PARD: Enhancing Goodput for Inference Pipeline via Proactive Request Dropping. In EuroSys 2026.

(2025). TITLE_TODO -- please fill in the EMNLP 2025 paper title. In EMNLP 2025.

(2025). SoK: Efficiency Robustness of Dynamic Deep Learning Systems. In USENIX Security 2025.

(2025). FDPT: Federated Discrete Prompt Tuning for Black-Box Visual-Language Models. In ICCV 2025.

(2025). Medusa: A Framework for Collaborative Development of Foundation Models with Automated Parameter Ownership Assignment. In FSE 2025.

(2024). DeciX: Explain Deep Learning Based Code Generation Applications. In ESEC/FSE 2024.

(2024). PPM: Automated Generation of Diverse Programming Problems for Benchmarking Code Generation Models. In ESEC/FSE 2024.

(2023). Dynamic Transformers Provide a False Sense of Efficiency. In ACL 2023.

(2023). The Dark Side of Dynamic Routing Neural Networks: Towards Efficiency Backdoor Injection. In CVPR 2023.

(2023). Sibling-Attack: Rethinking Transferable Adversarial Attacks against Face Recognition. In CVPR 2023.

(2023). DyCL: Dynamic Neural Network Compilation Via Program Rewriting and Graph Optimization. In ISSTA 2023.

(2022). NMTSloth: Understanding and Testing Efficiency Degradation of Neural Machine Translation Systems. In ESEC/FSE 2022.

(2022). Learning to Reverse DNNs from AI Programs Automatically. In IJCAI 2022.

(2022). NICGSlowDown: Evaluating the Efficiency Robustness of Neural Caption Generation Models. In CVPR 2022.

(2020). DENAS: Automated Rule Generation by Knowledge Extraction from Neural Networks. In ESEC/FSE 2020.

Experience

Research Assistant
Amazon Web Services
May 2023 – Aug 2023, Arlington Area, VA
Applied large language models to the Cedar authorization policy language.

Research Assistant
Microsoft Research
May 2021 – Jul 2020, Seattle
Evaluated the model leakage risk of on-device DNNs.

Research Assistant
NEC Laboratories America
Jan 2020 – May 2020, New Jersey
Applied ML techniques to program analysis.

Contact