Projects
My full name is Jiwoong Choi, but I go by Gio.
2025
- Language Models Surface the Unwritten Code of Science and Society. Honglin Bao, Siyang Wu*, Jiwoong Choi*, Yingrong Mao*, and James A. Evans. (Under Review) NeurIPS 2025, Dec 2025
This paper calls on the research community not only to investigate how human biases are inherited by large language models (LLMs) but also to explore how these biases in LLMs can be leveraged to make society’s "unwritten code" - such as implicit stereotypes and heuristics - visible and accessible for critique. We introduce a conceptual framework through a case study in science: uncovering hidden rules in peer review - the factors that reviewers care about but rarely state explicitly due to normative scientific expectations. The idea of the framework is to push LLMs to speak out their heuristics through generating self-consistent hypotheses - why one paper appeared stronger in reviewer scoring - among paired papers submitted to 45 computer science conferences, while iteratively searching deeper hypotheses from remaining pairs where existing hypotheses cannot explain. We observed that LLMs’ normative priors about the internal characteristics of good science extracted from their self-talk, e.g. theoretical rigor, were systematically updated toward posteriors that emphasize storytelling about external connections, such as how the work is positioned and connected within and across literatures. This shift reveals the primacy of scientific myths about intrinsic properties driving scientific excellence rather than extrinsic contextualization and storytelling that influence conceptions of relevance and significance. Human reviewers tend to explicitly reward aspects that moderately align with LLMs’ normative priors (correlation = 0.49) but avoid articulating contextualization and storytelling posteriors in their review comments (correlation = -0.14), despite giving implicit reward to them with positive scores. We discuss the broad applicability of the framework, leveraging LLMs as diagnostic tools to surface the tacit codes underlying human society, enabling more precisely targeted responsible AI.
- Academic Simulacra: Forecasting Research Ideas through Multi-Agent LLM Simulations. Jiwoong Choi, Donghyun Kang, Yingrong Mao, and James Evans. (Poster Presentation) ACM Collective Intelligence; Extended Abstract, Aug 2025
Keywords: Multi-Agent Simulation, Simulated Scholarship, Large Language Models
We introduce a multi-agent simulation framework for forecasting research ideas using "scholar agents" powered by large language models (LLMs). We instantiate approximately 2,686 scholar agents based on their publication histories prior to 2024 and simulate discussions to collectively generate key research ideas for 1,400 papers targeting seven major computer science conferences in 2024. We then evaluate the proximity of these generated ideas by comparing their semantic embeddings with those of the actual target papers written by the corresponding researchers. Our results suggest that LLM-based multi-agent simulations yield substantially higher similarity scores with real publications than two baselines: (1) the average pairwise similarity among papers within the same 2024 conference, and (2) a random set of past papers from the same conference. This demonstrates the predictive capacity of our scholar agent framework. We then further analyze how diversity in ethnic composition and institutional affiliations may correlate with the predictability of research, or inversely, the degree of surprise relative to the past. Our preliminary analysis suggests that the least predictable and thus most surprising research ideas emerge from teams affiliated with Chinese institutions but not composed of ethnically Chinese authors. These findings offer promising initial evidence that simulating knowledge-driven scholar agents can anticipate directions of scientific discovery and help explain the influence of social and institutional factors on innovation.
- Automating Scholarly Judgment. Jiwoong Choi, Siyang Wu, Yingrong Mao, and Honglin Bao. (Oral Presentation) International Conference on the Science of Science and Innovation (ICSSI), Sep 2025
Keywords: Automated Hypothesis Generation, Scientific Evaluation, Large Language Models
What constitutes good science remains a longstanding question in both philosophy and practice. Traditional peer review, for instance, has been critiqued for subjectivity, inconsistency, and potential bias. Recent advances in large language models (LLMs) offer a novel opportunity to re-examine this question at scale. Here, we analyze a dataset of approximately 27K papers submitted to 45 computer science conferences, paired on review scores to create clear distinctions in perceived quality. Rather than manually defining the criteria of “good” science, we task LLMs with iteratively proposing, testing, and refining hypotheses that explain why one paper might be judged as stronger than another. This yields a final pool of 20 orthogonal hypotheses with high coverage of pairs. Throughout this abductive reasoning process, the LLM’s initial “normative” prior beliefs (e.g., a good paper has high novelty) are updated into a posterior that reflects more professional-science criteria (e.g., a good paper tells a good story). LLMs could serve as powerful tools for uncovering latent patterns in how experts judge scientific work. Nevertheless, challenges remain. Interpretability is a critical bottleneck: while the iterative process yields human-understandable hypotheses, it relies on opaque LLM reasoning under the hood. In addition, substantial progress is still needed in guiding LLMs and humans toward a clearer understanding of what constitutes truly valuable science.
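The iterative propose-and-test loop described above can be illustrated with a small sketch. This is not the pipeline used in the work: `build_hypothesis_pool`, `toy_propose`, and `toy_explains` are hypothetical names, and the toy functions stand in for the LLM calls that would propose a hypothesis and judge whether it explains a pair. The sketch only shows the control flow of repeatedly searching the pairs that the current pool cannot yet explain.

```python
from collections import Counter


def build_hypothesis_pool(pairs, propose, explains, max_rounds=20):
    """Grow a pool of hypotheses until the pairs are covered or rounds run out.

    propose(remaining) returns one new hypothesis (or None);
    explains(hypothesis, pair) returns True if the hypothesis accounts for the pair.
    Both are placeholders for LLM calls in a real pipeline (an assumption here).
    """
    pool = []
    remaining = list(pairs)
    for _ in range(max_rounds):
        if not remaining:
            break
        hypothesis = propose(remaining)
        if hypothesis is None or hypothesis in pool:
            break
        still_unexplained = [p for p in remaining if not explains(hypothesis, p)]
        if len(still_unexplained) == len(remaining):
            break  # the new hypothesis explains nothing new, so stop
        pool.append(hypothesis)
        remaining = still_unexplained
    coverage = 1 - len(remaining) / len(pairs) if pairs else 0.0
    return pool, coverage


# Toy stand-ins: each "pair" is the set of traits that separate its two papers.
def toy_propose(remaining):
    counts = Counter(trait for pair in remaining for trait in sorted(pair))
    return counts.most_common(1)[0][0] if counts else None


def toy_explains(hypothesis, pair):
    return hypothesis in pair


toy_pairs = [{"rigor"}, {"rigor", "story"}, {"story"}, {"story", "novelty"}, {"novelty"}]
pool, coverage = build_hypothesis_pool(toy_pairs, toy_propose, toy_explains)
print(pool, coverage)
```

In this toy setup each round targets only the pairs left unexplained, which mirrors the idea of searching for deeper hypotheses where existing ones fall short.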
2024
- Korea Discount and Corporate Governance. SK Kim, Ye Jun Kim, and Jiwoong Choi. Morgan Stanley Capital International (MSCI Inc.), Sep 2024
This paper examines the "Board and Ownership and Control" Key Issue within the MSCI ESG Ratings Corporate Governance Theme, with a focus on board independence and adherence to the one-share-one-vote (OSOV) principle. Our analysis reveals significant governance challenges among Korean companies. Notably, less than half of the directors on Korean company boards are independent, a figure significantly lower than the global average of 66%. Additionally, 82% of Korean companies were flagged for Related Party Transactions (RPT), with smaller firms exhibiting lower board independence. Furthermore, Korean companies that deviated from the OSOV principle demonstrated weaker financial performance, with both return-on-equity (ROE) and price-to-book ratio (PBR) falling below the market average. These findings underscore the need for enhanced governance practices within Korean corporations to align more closely with global standards.
- ASEAN Gender Outlook 2024. Statistics: Jiwoong Choi. UN General Assembly (UNGA), Sep 2024
- Despite high primary and lower secondary education completion rates in ASEAN, only 64 percent of students complete upper secondary education, with boys, particularly in rural areas, being more likely to drop out due to economic barriers and opportunity costs. While girls tend to stay in school longer, challenges such as inadequate access to employment opportunities, poor educational infrastructure, and disparities between urban and rural areas persist, highlighting the need for increased investments in education across the region.
- Adolescent birth rates in South-East Asia have decreased from 41 to 35 per 1,000 between 2015 and 2024, with factors such as delayed marriage, access to contraceptives, and education contributing to this decline. However, disparities in education infrastructure, particularly in rural areas, remain a challenge, with limited access to basic facilities like sanitation and water, which increases the likelihood of teenage pregnancy and school dropout among rural girls.
- Despite South-East Asia being one of the world’s safest regions with a homicide rate of 1.8 per 100,000 people, a growing sense of insecurity, particularly among women, has emerged due to factors like COVID-19, economic disruptions, and crime, highlighting the need for enhanced law enforcement and inclusive security approaches.
- Over the past decade, official development assistance (ODA) for gender equality in the ASEAN region has increased significantly, with 47% of all ODA in 2022 supporting gender-focused initiatives, though investments directly targeting gender equality have declined, highlighting the need for continued and expanded funding to sustain progress in areas like women’s participation, violence reduction, and the gender-environment nexus.
2022
- Outdoor visual SLAM and Path Planning for Mobile-Robot. Seongil Heo, Jueun Mun, Jiwoong Choi, Jiwon Park, and Eric T. Matson. IEEE International Conference on Robotic Computing (IRC), Dec 2022
This paper proposes a robust visual SLAM and a path planning algorithm for autonomous vehicles in the outdoor environment. The consideration of the outdoor characteristics was essential in both SLAM and path planning processes. This study can be used when it is necessary to know the exact appearance of the environment due to the impossibility of observing the environment through a satellite map, e.g., inside a forest. The visual SLAM system was developed using GPS data in consideration of the deterioration of camera recognition performance outdoors. The GPS data was inserted into every multi-thread of visual SLAM, which are Camera Tracking, Local Mapping, and Loop Closing. It enhanced the accuracy of the map and saved computational power by preventing useless calculations. In the path planning part, our method divided the path based on the stability of the roads. When determining the optimal path, the stability of the road and the driving time were considered, and the weight was assigned based on the GPS data.
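As a rough illustration of trading off road stability against driving time, the sketch below runs Dijkstra's algorithm over a small road graph. It is not the paper's algorithm: the function `plan_path`, the per-edge `stability` score in [0, 1], the `penalty` parameter, and the cost formula are all assumptions made for this example, and the paper's GPS-based weighting is not reproduced here.

```python
import heapq


def plan_path(graph, start, goal, penalty=10.0):
    """Shortest path where each edge costs drive_time + penalty * (1 - stability).

    graph maps a node to a list of (neighbor, drive_time, stability) tuples.
    The cost formula is a hypothetical choice for illustration only.
    """
    best = {start: 0.0}
    prev = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, drive_time, stability in graph.get(node, []):
            new_cost = cost + drive_time + penalty * (1.0 - stability)
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if goal not in best:
        return None, float("inf")
    # Walk back from the goal to recover the route.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path, best[goal]


# Made-up road segments: A-B-D is faster but unstable, A-C-D is slower but stable.
roads = {
    "A": [("B", 2.0, 0.2), ("C", 3.0, 0.9)],
    "B": [("D", 2.0, 0.3)],
    "C": [("D", 3.0, 0.9)],
    "D": [],
}
print(plan_path(roads, "A", "D", penalty=10.0))
print(plan_path(roads, "A", "D", penalty=0.0))
```

With a high `penalty` the planner prefers the stable route through C; with `penalty=0.0` it falls back to the fastest route through B.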
2021
- Stock Investment Opinion Sentiment Analysis. Mirae Asset Big Data Hackathon, Nov 2021
- Collected text data from YouTube videos, YouTube comments, news articles, and bank reports using speech-to-text (STT), OCR, and web crawling.
- Fine-tuned Google’s ELECTRA model, a transformer pre-trained with a GAN-like generator-discriminator setup, using PyTorch.
- Conducted sentiment analysis and built a prototype service.
- Gave a presentation on behalf of our team and placed 4th out of 1,000.