Jihwan Oh

ericoh929 [at] kaist [dot] com

Google Scholar / GitHub / Deep Learning

Hello, I am Jihwan Oh, an AI researcher at KAIST AI. During my M.S., my research focused on multi-agent deep reinforcement learning [C1/J1/W2/W3]. These days, I focus on LLM reasoning, self-improvement, and test-time scaling [C3/W4]. Please refer to the papers below.


Publications

C: Conference, W: Workshop, J: Journal, P: Preprint, D: Domestic.
*: Equal Contribution, ^: Equal Advising

Conference

[C3/W4] Understanding Bias Reinforcement in LLM Agents Debate
Jihwan Oh*, Minchan Jeong*, Jongwoo Ko^, Se-Young Yun^
ICML’25 [Paper]

[C2] Preference Alignment with Flow Matching
Minu Kim, Yongsik Lee, Sehyeok Kang, Jihwan Oh, Song Chong, Se-Young Yun
NeurIPS’24 [Paper]

[C1/W2] Toward Risk-based Optimistic Exploration for Cooperative Multi-Agent Reinforcement Learning
Jihwan Oh*, Joonkee Kim*, Minchan Jeong, Se-Young Yun
AAMAS’23 [Paper]

Workshop

[W4] When Debate Fails: Bias Reinforcement in Large Language Models
Jihwan Oh*, Minchan Jeong*, Jongwoo Ko, Se-Young Yun
Reasoning and Planning for Large Language Models Workshop at ICLR’25 [Paper]

[W3] Diffusion-based Episodes Augmentation for Offline Multi-Agent Reinforcement Learning
Jihwan Oh, Sungnyun Kim, Gahee Kim, Seonghwan Kim, Se-Young Yun
Structured Probabilistic Inference & Generative Modeling (SPIGM) Workshop at ICML’24 [Paper]

[W2] Risk Perspective Exploration in Distributional Reinforcement Learning
Jihwan Oh, Joonkee Kim, Se-Young Yun
AI for Agent Based Modelling Workshop at ICML’22 [Paper]

[W1] Real-time and Explainable Detection of Epidemics with Global News Data
Sungnyun Kim, Jaewoo Shin, Seongha Eom, Jihwan Oh, Se-Young Yun
Workshop on Healthcare AI and COVID-19, ICML’22 [Paper]

International Journal

[J1] The StarCraft Multi-Agent Exploration Challenges: Learning Multi-Stage Tasks and Environmental Factors without Precise Reward Functions
Mingyu Kim*, Jihwan Oh*, Yongsik Lee, Joonkee Kim, Seonghwan Kim, Song Chong, Se-Young Yun
IEEE Access’23 [Paper]

Domestic Journal

[D5] A Study for Language Models as Agents for Strategic Decision-making Environments
Jihwan Oh, Se-Young Yun
한국인터넷정보학회 논문지’24 [Paper]

[D4] Diffusion Model-based Data Augmentation for Cooperative Multi-Agent Reinforcement Learning (협력적 다중 에이전트 강화학습을 위한 확산모델 기반 데이터 증강기법)
Jihwan Oh, Sungnyun Kim, Gahee Kim, Se-Young Yun
(사)디지털산업정보학회 논문지’24 [Paper]

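[D3] A Study on Developing the Defense Quality Management System: From a Pass/Fail System to a Quality-Level Measurement System (국방품질경영체제 발전방안에 관한 연구: 합부체제에서 품질수준 측정체제로)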
Namsu Ahn, Jihwan Oh
국방품질연구논집 (JDQS)’24 [Paper]

[D2] Analysis of Security Policy Effectiveness Using Individual Utility Maximization Model
Doyoung Kim, Jihwan Oh
국방과 보안’24 [Paper]

[D1] Risk Scheduling-based Optimistic Exploration for Distributional Reinforcement Learning
Jihwan Oh, Joonkee Kim, Se-Young Yun
정보과학회논문지’23 [Paper]


Patent

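[1] Reinforcement learning method and system for recommending courses of action in battlefield situations, and computing device therefor (전장 상황에서의 방책 추천을 위한 강화학습 방법 및 시스템, 이를 위한 컴퓨팅 장치)
KOR patent number: 10-2567-9280000
[2] Simulation method for selecting optimal guard post locations, and simulation device therefor (최적의 초소 위치 선정을 위한 시뮬레이션 방법 및 이를 위한 시뮬레이션장치)
KOR patent number: 10-2624-7720000
[3] Multi-agent reinforcement learning data augmentation device and method for unmanned combat systems (무인 전투체계를 위한 다중 에이전트 강화학습 데이터 증강 장치 및 그 방법)
KOR patent number: 10-2777-3920000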


Education

[KAIST], Ph.D. student in Graduate School of AI / Seoul, South Korea / 2024~, under Professor Se-Young Yun
[KAIST], M.S. in Graduate School of AI / Seoul, South Korea / Feb 2023, under Professor Se-Young Yun
[KMA], B.S. in Economics / Seoul, South Korea / Feb 2016