Intellectual Property India Publishes Patent Application for A Hardware-Coupled Federated Dual-Level Mixture-Of-Experts Inference System Employing Sparse Rotary Gated Attention, Confidence-Calibrated Abstention, Provable Access-Mask Isolation, And A GPU Kernel-Fusion Dispatch Pipeline

Posted On: 2026-06-26 Patentwipo

MUMBAI, India, June 26 -- Intellectual Property India has published a patent application (202641073125 A) filed by Aiconsortium Private Limited on June 12, 2026, for A Hardware-Coupled Federated Dual-Level Mixture-Of-Experts Inference System Employing Sparse Rotary Gated Attention, Confidence-Calibrated Abstention, Provable Access-Mask Isolation, And A GPU Kernel-Fusion Dispatch Pipeline.

The application names Maurya Vijayaramachandiran as an inventor.

The application for the patent was published on June 19, 2026, under issue no. 25/2026.

Abstract: A hardware-coupled inference system (10) comprises a routing-node GPU server (20) and sovereign compute nodes (66) interconnected by a private-circuit mesh. A backbone transformer whose every feed-forward sublayer is a mixture-of-experts sublayer (28) cooperates with sparse rotary gated attention (24) activating a learned subset of heads, and a confidence-calibrated abstention router (26) emitting typed abstentions before any external dispatch. A backbone-detached three-tier router (40) selects sovereign-hosted experts under a negative-infinity access mask (48) of provably zero probability and zero gradient. A kernel-fusion pipeline (70) executes attention, expert evaluation, and dispatch packing as one GPU command graph (72); an occupancy controller (76) reserves streaming-multiprocessor headroom for DMA; a pre-fetch scheduler (80) warms remote caches one layer ahead. Approximately 1.02 trillion stored parameters are served at approximately 23 billion active per token, with occupancy of at least 85 per cent,
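
The abstract's claim that a negative-infinity access mask gives "provably zero probability and zero gradient" matches a well-known property of the softmax function, which the minimal sketch below illustrates. It is not taken from the application: the function names, the four example logits, and the allow-list are all hypothetical, and the application's actual three-tier router may work quite differently.

```python
import math

def masked_softmax(logits, allowed):
    """Softmax over router logits; disallowed experts receive a -inf logit.

    Hypothetical helper for illustration, not the application's router.
    """
    masked = [x if ok else -math.inf for x, ok in zip(logits, allowed)]
    peak = max(masked)
    if peak == -math.inf:
        raise ValueError("at least one expert must be allowed")
    # math.exp(-inf) is exactly 0.0, so masked experts contribute nothing.
    exps = [math.exp(x - peak) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_grad(probs, i, j):
    """Partial derivative d probs[i] / d logits[j] = p_i * (delta_ij - p_j)."""
    delta = 1.0 if i == j else 0.0
    return probs[i] * (delta - probs[j])

# Assumed example: four experts, of which experts 1 and 3 are access-masked.
logits = [2.0, 0.5, 1.0, 3.0]
allowed = [True, False, True, False]
probs = masked_softmax(logits, allowed)

# Gradient of every output probability with respect to the masked logit 1.
grads_wrt_masked = [softmax_grad(probs, i, 1) for i in range(len(probs))]

print(probs)
print(grads_wrt_masked)
```

Because a masked expert's probability is exactly zero, every term of the softmax derivative that involves its logit is multiplied by zero, so no gradient flows to it regardless of how large its raw logit was (expert 3 has the highest raw logit here and is still excluded).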

Disclaimer: Curated by HT Syndication.
