Fast2Vec, a modified model of FastText that enhances semantic analysis in topic evolution


Authors: Ayu Pertiwi, Azhari Azhari, Sri Mulyana

Fast2Vec combines semantic embedding and subword modeling to track dynamic topic evolution. It is branded as a hybrid AI-enhanced topic model.

Problem and Challenge

    Semantic topic modeling struggles to capture nuanced word meanings, particularly with negation, rare words, and synonymy. Standard models such as LDA and DTM have difficulty maintaining semantic coherence, while embedding models such as Word2Vec and FastText are limited by out-of-vocabulary (OOV) words or by context insensitivity, as shown in Figure 1.

    Figure 1. Visualizing Semantic Limitations in Traditional Topic Models

    Goal of Experimentation

      The experiment aims to develop Fast2Vec, a hybrid word embedding model that integrates Word2Vec and FastText to improve semantic accuracy in dynamic topic modeling. The improved word representations are then used to track topic trends and evolution patterns.

      Methods

        Fast2Vec combines Word2Vec and FastText embeddings through a weighted summation with a weight of 0.5. DTM models the topics over time, while UMAP and Affinity Propagation support semantic clustering. Semantic similarity is evaluated with cosine similarity and with Spearman and Pearson correlation.
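The weighted summation and the cosine similarity check described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `combine_embeddings` and `cosine_similarity` are hypothetical, both vectors are assumed to have the same dimension, the weight is assumed to apply to the Word2Vec vector (with the remainder going to FastText), and falling back to the FastText vector alone for OOV words is an assumption about how the missing Word2Vec vector might be handled.

```python
import math
from typing import List, Optional, Sequence


def combine_embeddings(
    w2v_vec: Optional[Sequence[float]],
    ft_vec: Sequence[float],
    weight: float = 0.5,
) -> List[float]:
    """Return weight * w2v_vec + (1 - weight) * ft_vec, element by element.

    Assumption: when the word is out of vocabulary for Word2Vec
    (w2v_vec is None), only the FastText subword vector is used.
    """
    if w2v_vec is None:
        return list(ft_vec)
    if len(w2v_vec) != len(ft_vec):
        raise ValueError("both embeddings must have the same dimension")
    return [weight * a + (1.0 - weight) * b for a, b in zip(w2v_vec, ft_vec)]


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy 2-dimensional vectors, made up for illustration only.
in_vocab = combine_embeddings([1.0, 0.0], [0.0, 1.0])  # equal-weight blend
oov_word = combine_embeddings(None, [0.2, 0.4])        # FastText-only fallback
score = cosine_similarity(in_vocab, oov_word)
```

With a weight of 0.5 the blend is simply the average of the two vectors, so neither model dominates the combined representation.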

        System Architecture

          The system workflow, shown in Figure 2, consists of data preprocessing, Fast2Vec embedding generation, DTM-based topic extraction, dimensionality reduction with UMAP, semantic clustering with Affinity Propagation (AP), and evolution tracking via entropy analysis. This pipeline enables interpretable and adaptive topic modeling.
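The entropy analysis in the last pipeline step can be illustrated with a short sketch. The summary does not specify which entropy measure or which distribution is used, so this example assumes Shannon entropy (in bits) computed over a topic's word weights in each time slice; the function names and the toy weights are hypothetical.

```python
import math
from typing import List, Sequence


def shannon_entropy(weights: Sequence[float]) -> float:
    """Shannon entropy in bits of non-negative weights, normalized to sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)


def entropy_series(slices: Sequence[Sequence[float]]) -> List[float]:
    """Entropy of one topic's word distribution in each consecutive time slice."""
    return [shannon_entropy(s) for s in slices]


# Toy word weights for one topic over three time slices (made up for illustration).
topic_over_time = [
    [1.0, 0.0, 0.0, 0.0],  # all mass on one word
    [0.5, 0.5, 0.0, 0.0],  # mass split over two words
    [1.0, 1.0, 1.0, 1.0],  # mass spread evenly over four words
]
series = entropy_series(topic_over_time)
# Change in entropy between consecutive slices.
deltas = [later - earlier for earlier, later in zip(series, series[1:])]
```

Rising entropy means the topic's probability mass is spreading more evenly across words, while a flat series means the distribution is holding its shape. How such trends map onto the evolution patterns reported in the study is not specified in this summary.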

          Figure 2. System Architecture

          Results and Discussion

            Fast2Vec improves similarity by 39.64% over Word2Vec in OOV settings and outperforms FastText by 6.18%. It performs best on 7 of 12 benchmark datasets. The model (Figure 3) also successfully categorizes topic evolution patterns (diffusion, stability, shift, and moderate fluctuation), which are validated through entropy-based trend analysis.

            Figure 3. Fast2Vec Outperforms on Semantic Shift and OOV

            Value Proposition

              Fast2Vec offers robust word representations that support fine-grained topic evolution tracking. Its integration of context and subword modeling makes it ideal for applications in NLP research, scientometrics, and semantic analysis over time. It bridges the gap between statistical modeling and semantic precision.
