Nonintrusive appliance state detection techniques estimate the operating states of appliances in a building using only the building's aggregate energy consumption. Deep learning (DL) approaches have recently emerged as superior solutions to this task. These approaches deploy a separate model for each appliance whose state is to be identified. Although this enables each model to learn its appliance's behavior accurately, it places an additional burden on the computing device, such as a smart meter, in terms of memory and computation time. This article addresses the problem by formulating state detection as a multilabel classification problem, in which a single model predicts the operating states of multiple appliances. In particular, a novel, lightweight DL model that combines dilated causal convolution with multihead attention is proposed for efficient appliance state prediction. The dilated causal convolution layer automatically extracts useful features from the aggregate data, and the attention layer uses those features selectively to learn the appliance states. The performance of the proposed model is validated in multiple scenarios using actual energy data collected from different buildings. The test results demonstrate the model's feasibility and show that it outperforms various state-of-the-art multilabel classification techniques. Further, the model's benefits are highlighted through ablation studies, an analysis of computational complexity, and an investigation of how the amount of historical aggregate energy data affects the model's performance.
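To make the two central ideas concrete, the following is a minimal sketch in plain Python, not the proposed model. It illustrates (i) a dilated causal convolution, where each output depends only on current and past aggregate samples, and (ii) a multilabel decision, where one set of outputs yields an independent ON/OFF label per appliance. The function names, filter weights, logits, and the 0.5 threshold are illustrative assumptions, and the multihead attention layer is omitted entirely.

```python
import math

def causal_dilated_conv1d(x, weights, dilation):
    """Compute y[t] = sum_k weights[k] * x[t - k*dilation].

    Samples before the start of the sequence are treated as zero,
    so y[t] never depends on future inputs (causality).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_states(logits, threshold=0.5):
    """Multilabel decision: one independent ON (1) / OFF (0) label per appliance."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]

# Hypothetical aggregate power readings (arbitrary units).
aggregate = [1.0, 2.0, 3.0, 4.0, 5.0]

# Two-tap filter with dilation 2: features[t] = x[t] + x[t-2].
features = causal_dilated_conv1d(aggregate, weights=[1.0, 1.0], dilation=2)

# Hypothetical per-appliance logits, standing in for the output of a trained model.
states = predict_states([2.0, -1.0, 0.3])
```

Because every appliance has its own sigmoid output, several appliances can be reported ON at the same time, which is what distinguishes the multilabel formulation from a single multiclass decision.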