| On Calibration of Large Language Models: From Response To Capability | cs.AI | 0 | Open |
| ACE-Bench: A Lightweight Benchmark for Evaluating Azure SDK Usage Correctness | cs.AI | 0 | Open |
| AISA: Awakening Intrinsic Safety Awareness in Large Language Models against Jailbreak Attacks | cs.AI | 0 | Open |
| Privacy-Concealing Cooperative Perception for BEV Scene Segmentation | cs.AI | 0 | Open |
| Discrete-Space Generative AI Pipeline for Semantic Transmission of Signals | cs.AI | 0 | Open |
| OpAgent: Operator Agent for Web Navigation | cs.AI | 0 | Open |
| Mitigating the Safety-utility Trade-off in LLM Alignment via Adaptive Safe Context Learning | cs.AI | 0 | Open |
| Who Do LLMs Trust? Human Experts Matter More Than Other LLMs | cs.AI | 0 | Open |
| LLM-Confidence Reranker: A Training-Free Approach for Enhancing Retrieval-Augmented Generation Systems | cs.AI | 0 | Open |
| Elo-Evolve: A Co-evolutionary Framework for Language Model Alignment | cs.AI | 0 | Open |
| Rubrics as an Attack Surface: Stealthy Preference Drift in LLM Judges | cs.AI | 0 | Open |
| PoultryLeX-Net: Domain-Adaptive Dual-Stream Transformer Architecture for Large-Scale Poultry Stakeholder Modeling | cs.AI | 0 | Open |
| Differentiable Rule Induction from Raw Sequence Inputs | cs.AI | 0 | Open |
| A First Proof Sprint | cs.AI | 0 | Open |
| Two-Stream Interactive Joint Learning of Scene Parsing and Geometric Vision Tasks | cs.AI | 0 | Open |