Authors: Sherly Christina, Azhari Azhari, Yohanes Suyanto
Problem and Challenge
How can we understand what people care about when it comes to energy? Public discussions on platforms like Reddit, X, Facebook and Kaskus contain a wealth of valuable insights. However, most of these texts are unstructured and unlabeled, which makes it hard for policymakers, especially in low-resource countries, to use them effectively for energy decisions.
Figure 1 shows how public discussions on online platforms reflect the complexity of public concerns about energy issues. Fuel prices, electricity access, energy security, environmental impacts, and economic burdens are all part of a narrative that policymakers need to capture and understand.
Goal and Innovation
This study introduces HybridAI-Energy, a system that combines two complementary methods: an unsupervised machine learning pipeline and a rule-based linguistic analysis. The goal is to automatically extract important energy-related terms (called aspect terms) from public discourse without annotated data, so that citizen concerns can be reflected more directly in policy.
How It Works
The system runs two parallel pipelines:
- Unsupervised Approach: Uses topic modeling (LDA), BERT-based sentence embeddings, autoencoders for dimension reduction, and K-means clustering to discover thematic patterns.
- Rule-Based Approach: Applies linguistic rules to identify nouns and noun phrases that often indicate public concerns in sentence structures.
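The rule-based idea can be illustrated with a minimal sketch. It assumes the tokens have already been part-of-speech tagged by some external tagger, and it uses one hypothetical rule: keep every maximal run of adjectives and nouns that ends in a noun. The function name `extract_noun_phrases`, the tag names, and the example sentence are illustrative assumptions, not the rules used in the study.

```python
def extract_noun_phrases(tagged_tokens):
    """Return maximal ADJ/NOUN runs that end in a NOUN (hypothetical rule)."""
    phrases = []
    current = []
    # A sentinel token with a non-matching tag flushes the final run.
    for word, tag in list(tagged_tokens) + [("", "END")]:
        if tag in ("ADJ", "NOUN"):
            current.append((word, tag))
            continue
        # The run has ended: drop trailing adjectives so the phrase ends in a noun.
        while current and current[-1][1] != "NOUN":
            current.pop()
        if current:
            phrases.append(" ".join(w for w, _ in current))
        current = []
    return phrases


# Hand-tagged example sentence (tags are assumed, not produced by a real tagger).
tagged = [
    ("The", "DET"), ("rising", "ADJ"), ("fuel", "NOUN"), ("price", "NOUN"),
    ("hurts", "VERB"), ("poor", "ADJ"), ("households", "NOUN"),
]
candidates = extract_noun_phrases(tagged)
print(candidates)
```

In a real pipeline the tags would come from a tagger suited to the language of the forum posts, and the candidate phrases would then be passed on to the refinement step.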
Candidate terms from both methods are refined using:
- PMI (Pointwise Mutual Information) to detect how strongly a term is associated with a sentence.
- Cosine Similarity to check semantic closeness between the term and its sentence.
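The two refinement scores can be sketched as follows. This is a simplified illustration under stated assumptions: PMI is computed between a term and a context word from sentence-level co-occurrence counts, and cosine similarity is computed over bag-of-words counts as a stand-in for the BERT-based sentence embeddings mentioned above. The toy sentences and the function names `pmi` and `cosine` are hypothetical, and the study's exact formulation and thresholds are not given in this summary.

```python
import math
from collections import Counter


def pmi(term, context, sentences):
    """PMI (base 2) of two words from sentence-level co-occurrence."""
    token_sets = [set(s.lower().split()) for s in sentences]
    n = len(token_sets)
    p_term = sum(term in t for t in token_sets) / n
    p_ctx = sum(context in t for t in token_sets) / n
    p_joint = sum(term in t and context in t for t in token_sets) / n
    if p_joint == 0:
        return float("-inf")  # the two words never co-occur
    return math.log2(p_joint / (p_term * p_ctx))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (dict-like)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy corpus (invented for illustration only).
sentences = [
    "fuel price keeps rising",
    "fuel price hurts drivers",
    "solar panels cut pollution",
    "air pollution is rising",
]

score_pmi = pmi("fuel", "price", sentences)
score_cos = cosine(Counter("fuel price".split()),
                   Counter(sentences[0].split()))
print(score_pmi, score_cos)
```

A candidate term would be kept only if both scores pass chosen cutoffs; how those cutoffs are set is a design decision that depends on the corpus.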
Figure 2 illustrates the architecture of the hybrid system used for processing text to extract terms that reflect citizen concerns related to energy.
Results and Insights
Using over 8,893 sentences from online forums, the system extracted key public concerns such as fuel price, renewable energy, economic hardship, and air pollution. These concerns were mapped to five strategic national energy themes:
- Sustainable energy
- Economic problems
- Environmental risks
- Energy access
- Energy security
The hybrid model outperformed standalone methods in four key areas: lexical diversity, balance of distribution, semantic coherence, and contextual relevance.
Value Proposition
HybridAI-Energy is a novel and transferable tool for mining structured insights from messy, unlabeled public text. Its key strengths include:
- Policy support: Helps governments understand and respond to citizens’ real concerns.
- Scalability: Works without needing expensive labeled datasets.
- Interpretable AI: Offers clear and meaningful outputs that align with national energy goals.
This system bridges AI and linguistics to make public voices more visible in energy policymaking, particularly in low-resource environments where citizen input is often overlooked.