About

I’m a Mechatronics Engineer. I graduated from the Mechatronics Engineering Department at Tishreen University in 2022, where I focused primarily on Deep Reinforcement Learning under the supervision of Doctor Iyad Hatem. Previously, I was a Research Assistant at the Robotics Club of Tishreen University under the guidance of Doctor Essa Alghannam, where I worked on time series and computer vision research. In 2023, I became a Teaching Assistant for CS courses at the Mechatronics Department.

I’m broadly interested in Computer Science research and am currently applying to Ph.D. programs in Computer Science in the U.S., focusing primarily on AI. I take a particular interest in Reinforcement Learning, Statistical RL, and Deep RL. My research goal is to see complex, human-like behavior emerge from unsupervised interaction between groups of learning agents, with an applied focus on game theory and on techniques for decision-making (planning and learning) that enable single situated agents and teams of agents to act intelligently in their environments and exhibit goal-directed behavior in real time. Concretely, this leads to several questions I’m currently interested in:

How can we use RL to design models of human agents? How can we ensure that RL-designed agents are human-compatible?
How can we synthesize environments that push and test the capabilities of our agents?
What algorithmic advances and software tools are needed to address these questions?

In practice, this means working to advance the state of the art in multi-agent RL algorithms, designing new data-driven simulators, and deploying simulator-designed controllers in real-world systems.

I’m also interested in state-of-the-art research that integrates RL and NLP, including:

RL for generative models, e.g., fine-tuning LLMs with RL, Transformer RL, and the IL library.
Algorithmic and theoretical foundations of RLHF, e.g., offline RLHF, contextual dueling bandits with active query.
RL with offline and online data, e.g., hybridRL.
Representation learning in RL, e.g., theory of representation learning in RL, and theory of representation transfer in RL.
Learning in partially observable systems, e.g., PAC RL with general function approximation in POMDPs and PSRs.

Aya Shabbar