World Scientific Publishing Co.: International Journal of Semantic Computing: Table of Contents. List of articles from both the latest and ahead-of-print issues.
- Author Index Volume 18 (2024) on October 25, 2024 at 7:00 am
International Journal of Semantic Computing, Volume 18, Issue 04, Page 739-743, December 2024.
- AI-Based Cropping of Sport Videos Using SmartCrop by Sayed Mohammad Majidi Dorcheh on August 27, 2024 at 7:00 am
International Journal of Semantic Computing, Volume 18, Issue 04, Page 637-662, December 2024. In the rapidly evolving landscape of digital platforms, the need for optimizing media representations to cater to various aspect ratios is palpable. In this paper, we pioneer an approach that utilizes object detection, scene detection, outlier detection, and interpolation for smart cropping. Using soccer as a case study, our primary goal is to capture the frame salience using object (player and ball) detection and tracking using AI models. To improve the object detection and tracking, we rely on scene understanding and explore various outlier detection and interpolation techniques. Our pipeline, called SmartCrop, is efficient, and supports various configurations for object tracking, interpolation, and outlier detection to find the best point-of-interest to be used as the cropping center of the video frame. An objective evaluation of the performance of individual pipeline components has validated our proposed architecture. Moreover, a crowdsourced subjective user study, assessing the alternative approaches for cropping from 16:9 to 1:1 and 9:16 aspect ratios, confirms that our proposed approach increases the end-user quality of experience.
- Human-Inspired Meta-Reinforcement Learning Using Bayesian Knowledge and Enhanced Deep Q-Network by Joshua Ho on August 20, 2024 at 7:00 am
International Journal of Semantic Computing, Volume 18, Issue 04, Page 547-569, December 2024. Over the last decades, there has been growing interest in research in multiple and interdisciplinary fields of human-AI computing. In particular, approaches integrating human’s perspective and design with reinforcement learning (RL) have received more attention. However, the current research on RL may need to consider its enhancement from human-inspired approaches further. In this work, we focus on enabling a meta-reinforcement learning (meta-RL) agent to achieve adaptation and generalization, according to modeling Markov Decision Processes (MDP) using Bayesian knowledge and analysis. By introducing a novel framework called human-inspired meta-RL (HMRL), we incorporate the agent performing resilient actions to leverage the dynamic dense reward based on the knowledge and prediction of a Bayesian analysis. The proposed framework can make the agent learn generalization and prevent the agent from failing catastrophically. The experimental results show that our approach helps the agent reduce computational costs with learning adaptation. In addition to the system design, we have also extended further algorithmic improvement based on learning within a deep Q-network (DQN) implementations for more complicated future tasks, which compared replay buffers to possibly enhance the optimization process. Finally, we conclude and anticipate that integrating human-inspired meta-RL can enable learning more formulations relating to robustness and scalability, leading to promising directions and more complex AI goals in the future.
- Hierarchical Graph Neural Networks with Scale-Aware Readout for Image Classification by João Pedro Oliveira Batisteli on August 17, 2024 at 7:00 am
International Journal of Semantic Computing, Volume 18, Issue 04, Page 713-738, December 2024. This work addresses the importance of incorporating multi-scale information in image representation by proposing a novel approach utilizing hierarchical segmentation and graph neural networks (GNNs). The proposed model, named Hierarchical Image Graph with Scale Importance (HIGSI), leverages hierarchical segmentation to construct graphs that capture relationships between nodes across different scales. This multi-scale representation simultaneously captures intricate details and global context, leading to a richer understanding of image structure than traditional methods. Additionally, a novel Region Graph Readout (RGR) function is introduced to assess the significance of each scale within the graph representation. By combining this multi-scale representation and the RGR function, HIGSI achieves competitive performance on image classification tasks, using smaller graphs or having fewer parameters than existing methods. This work also presents a comparative study with another hierarchical approach and an assessment of HIGSI’s components to investigate its decision-making process and its components’ contribution to the overall performance.
- NN-VVC: A Hybrid Learned-Conventional Video Codec Targeting Humans and Machines by Jukka I. Ahonen on August 17, 2024 at 7:00 am
International Journal of Semantic Computing, Volume 18, Issue 04, Page 689-712, December 2024. Advancements in artificial intelligence have significantly increased the use of images and videos in machine analysis algorithms, predominantly neural networks. However, the traditional methods of compressing, storing and transmitting media have been optimized for human viewers rather than machines. Current research in coding images and videos for machine analysis has evolved in two distinct paths. The first is characterized by End-to-End (E2E) learned codecs, which show promising results in image coding but have yet to match the performance of leading Conventional Video Codecs (CVC) and suffer from a lack of interoperability. The second path optimizes CVC, such as the Versatile Video Coding (VVC) standard, for machine-oriented reconstruction. Although CVC-based approaches enjoy widespread hardware and software compatibility and interoperability, they often fall short in machine task performance, especially at lower bitrates. This paper proposes a novel hybrid codec for machines named NN-VVC, which combines the advantages of an E2E-learned image codec and a CVC to achieve high performance in both image and video coding for machines. Our experiments show that the proposed system achieved up to −43.20% and −26.8% Bjøntegaard Delta rate reduction over VVC for image and video data, respectively, when evaluated on multiple different datasets and machine vision tasks according to the common test conditions designed by the VCM study group in MPEG standardization activities. Furthermore, to improve reconstruction quality, we introduce a human-focused branch into our codec, enhancing the visual appeal of reconstructions intended for human supervision of the machine-oriented main branch.