Indrajeet Aditya Roy

Stabilizing Quantized KV Caches via Subspace Embeddings and Manifold Interpolation

paper

Large Language Models (LLMs) function as autoregressive predictors, computing attention over a cached history of key-value projections to avoid the O(T^2) cost of recomputation. As context lengths grow, memory constraints necessitate aggressive quantization of these caches to low-precision integer representations. This quantization introduces a metric mismatch between storage and semantic spaces: perturbations measured by Hamming distance in the storage domain do not correspond to bounded perturbations in Euclidean distance after dequantization. A linear-algebraic framework is proposed for stabilizing quantized KV caches using classical error-correcting codes. Quantized values are embedded into higher-dimensional subspaces over F_2 via the generator matrices of Hamming and Golay codes, enabling O(1) syndrome-based detection and correction of bit-flip errors before values are lifted back to R^d. For errors exceeding the code's correction capacity, a geometric recovery strategy is employed that exploits the local smoothness of cache sequences through linear interpolation of detected erasures. Experiments on GPT-2 and LLaMA-3.1-8B demonstrate that at bit error rates up to 10^-2, unprotected caches diverge catastrophically, while protected caches maintain baseline performance.

Mitigating Deanonymization and DoS Attacks in Decentralized Networks

paper

The Onion Router (Tor) enables anonymous TCP communication through layered encryption and decentralized relay routing, but its open participation model exposes the network to adversarial exploitation. This paper analyzes two critical attack vectors targeting Tor's architecture—Sybil-based deanonymization, in which an adversary injects malicious relays to compromise circuit anonymity, and Cellflood denial-of-service, which overwhelms relay resources through malformed cell injection. A multi-layered defense framework is proposed that combines enhanced Directory Authority monitoring with IP-based admission control and client-side cryptographic puzzles. Evaluation demonstrates that the proposed defenses achieve robust resistance to both attack classes while preserving Tor's anonymity guarantees and decentralized trust model.

Improving Document Summarization Through Advanced Language Techniques: Fine tuning LLMs, and Retrieval-Augmented Generation

paper

Abstractive summarization of scholarly documents remains challenging due to sparse domain-specific training data and the difficulty of preserving factual consistency across long inputs. This work presents an integrated approach that combines Retrieval-Augmented Generation (RAG) with parameter-efficient fine-tuning of the Mistral 7B language model. A RAG pipeline backed by Neo4j knowledge graphs and FAISS vector stores provides dynamic factual grounding over research paper content, while Low-Rank Adaptation (LoRA) and Parameter-Efficient Fine-Tuning (PEFT) adapt the generative model to the linguistic conventions of academic text. The combined system mitigates hallucination and data scarcity, producing concise, context-rich summaries that capture the essential content of source documents.

Vision-Language Integration in LLMs: A Survey of Architectures, Training Paradigms, and Applications

paper

Vision-Language Models (VLMs) extend Large Language Models with the ability to jointly process visual and textual information, enabling tasks such as image captioning, visual question answering, and cross-modal reasoning. This survey provides a comprehensive examination of VLM architectures, training paradigms, and applications, systematically tracing the architectural progression from early dual-stream encoders with cross-modal attention to modern unified single-stream frameworks. Key developments in visual feature extraction, feature space organization, pretraining strategies, and cross-modal learning objectives—including contrastive, masking, and generative methods—are analyzed. The survey further examines emerging evaluation frameworks and identifies open challenges in architectural efficiency, scalability, and real-world deployment.

Dynamic Resource Aware Task Scheduling for Mobile Edge Cloud Computing

paper

Energy and performance-aware task scheduling in Mobile Cloud Computing (MCC) requires jointly optimizing offloading decisions across energy consumption, execution latency, and deadline constraints. Existing algorithms rely on static power models, fixed network conditions, and predetermined task characteristics, limiting their practical applicability. This paper presents an enhanced scheduling framework that integrates dynamic power profiles accounting for device operating states and workload-dependent consumption, models realistic network conditions with time-varying throughput and signal-strength-dependent energy costs, and extends the binary device-cloud architecture into a three-tier model incorporating edge computing resources. A Q-learning approach for task migration is further employed to adaptively explore the expanded solution space and identify energy-efficient offloading patterns, bridging the gap between theoretical MCC scheduling algorithms and practical deployment requirements.