黄 文浩 ( コウ ブンコウ )

Huang, Wenhao

Affiliation (Campus)

Graduate School of Media and Governance (Shonan Fujisawa)

Position

Project Assistant Professor (Fixed-term)

 

Papers

  • Object Size Classification in Garbage Disposal Sensing System Using Monocular Depth Estimation

    Ito T., Huang W., Chen Y., Nakazawa J.

    Communications in Computer and Information Science 2650 CCIS   190 - 208 2026

    ISSN 1865-0929

    Deep learning-based object detection is widely used in urban sensing, enabling tasks such as pedestrian, pothole, and waste detection. Automotive sensing with dashcams facilitates large-scale, real-time detection across urban environments. However, existing studies primarily focus on detection without estimating object size, which is crucial for event classification. Conventional size estimation methods rely on RGB-D cameras, multiple cameras, or LIDAR, making them unsuitable for large-scale automotive sensing with single RGB dashcams. Monocular depth estimation provides relative depth but does not yield absolute size measurements. To address this limitation, we propose a novel approach that combines monocular depth estimation with a reference object of known size. By comparing the detected object’s pixel dimensions with those of the reference object, its physical size can be estimated. To validate our approach, we developed an automotive sensing platform that detected and quantified household garbage bags using footage from the rear-view camera of garbage trucks. The truck body serves as the reference object, ensuring reliable size estimation. Experiments conducted with real-world data collected using an NVIDIA Jetson TX2 demonstrated the effectiveness of our method. The proposed approach achieves size estimation accuracy with mean squared errors (MSEs) of 20.02 for width and 18.68 for height while maintaining an end-to-end processing rate of 19.21 frames per second (FPS) for detection, tracking, and size estimation.

  • SORA-SORT: A Simple Occlusion Risk-Aware Framework for Multi-Object Tracking

    Kato K., Goto D., Huang W., Tsuge A., Okoshi T., Nakazawa J.

    UbiComp Companion 2025 Companion of the 2025 ACM International Joint Conference on Pervasive and Ubiquitous Computing    251 - 255 December 2025

    To cost-effectively acquire people flow data, the use of bird's-eye view cameras that capture wide areas from a distance is being considered. However, videos captured by such cameras frequently exhibit dynamic occlusions, such as those caused by interactions among individuals, and static occlusions, such as those caused by fixed obstacles, leading to a decline in tracking performance. Therefore, this study develops a system to estimate dynamic and static occlusion risk using only detection and tracking information, proposing a general multi-object tracking (MOT) performance improvement method by dynamically controlling tracker parameters based on this risk. Comparative experiments against a baseline method demonstrated that our approach outperformed the baseline on most metrics, suggesting that our method contributes to enhancing tracking performance in occlusion-prone areas. However, an increase in ID switches (IDS) and a decrease in processing speed occurred. The increase in IDS is considered to be due to the unnecessary retention of tracks that should have been deleted due to lifespan extension. This result implies a trade-off between tracking continuity and the consistency of ID assignment.

  • JumpQ: Stochastic Scheduling to Accelerating Object-detection-driven Mobile Sensing on Object-sparse Video Data

    Mikami K., Huang W., Chen Y., Nakazawa J.

    ACM SenSys 2025 23rd ACM Conference on Embedded Networked Sensor Systems in Transactions to Conference Embedded Artificial Intelligence and Sensing Systems    332 - 344 May 2025

    Deep learning-based object detection has seen a surge in applications for sensing systems on mobile devices. In this context, objects are identified and tracked across video frames, facilitating the calculation of associated events of interest. A significant research challenge refers to the acceleration of processing speed, which is constrained by deep learning-based object detection due to its intensive resource requirements. This paper focuses on a typical mobile sensing scenario, wherein sequences of frames containing objects of interest are sparsely dispersed throughout the video stream. Given that many of the frames lack objects, allocating substantial computational resources to detect them becomes inefficient. In light of this, we propose a stochastic scheduling algorithm, JumpQ. JumpQ performs per-frame detection when anticipating the presence of objects in the current frames. Consecutive negative detections prompt a transition to intermittent detection with a probability that undergoes further decay if the negative detection persists until reaching a predefined limit. Upon a positive detection, JumpQ swiftly reverts to per-frame detection and retraces a specific number of previously buffered frames to ensure the inclusion of potentially missed true frames. A comprehensive experimental study using the garbage bag counting technique was conducted to show the efficiency of JumpQ in accelerating the processing speed by nearly 1.92 times while maintaining a negligible impact on sensing accuracy.

  • AdaLine: Adaptive Counting Line Optimization by Perspective-Aware Trajectory Modeling in Object-Detection-Tracking Systems

    Huang W., Chen Y., Nakazawa J., Okoshi T.

    IEEE Access 13   111282 - 111292 2025

    The combination of object detection, tracking, and counting has become a widely used method. The position and angle of cameras can vary according to the deployment scenarios, which affects counting accuracy. Traditional approaches often rely on manually pre-defined counting lines or regions-of-interest (ROIs), which are static, environment-specific, and difficult to generalize. To overcome these limitations, we propose AdaLine, an adaptive, perspective-aware algorithm that learns the optimal counting line from object trajectories, thereby enabling adaptation to diverse environments and camera viewpoints without manually defined counting lines. AdaLine adapts automatically as the scene evolves by clustering incoming trajectories with K-means, selecting the most stable line candidate, and smoothing it with an exponential moving average. Experimental evaluations across different scenarios and camera settings show that AdaLine achieves better performance in terms of accuracy, stability, and applicability. Our approach offers a scalable, real-time configuration-free solution for object-detection-tracking systems.

  • Forecasting Household Waste Generation with Deep Learning and Long-Term Granular Database

    Zhang Y., Huang W., Chen Y., Nakazawa J.

    ACM International Conference Proceeding Series    179 - 182 March 2024

    Forecasting household waste generation using conventional methods can present challenges due to the substantial variability and uncertainty in the process. Furthermore, previous studies that focused on forecasting household waste generation at municipal or national levels may not be directly applicable to on-site waste collection processes. The objective of this research is to attain daily-level predictions of household waste generation and assess the advantages of using a leading-edge deep learning approach over conventional methods. We applied a multi-variable Long Short-Term Memory (LSTM) neural network to a granular garbage disposal database for forecasting. This database is curated from a garbage disposal sensing platform currently operational in three cities within Kanagawa Prefecture, Japan, with plans for operation until 2025. Additionally, relevant web applications based on findings from this research will be developed for data visualization and routine optimization.
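
The reference-object idea described in the abstract of the first paper listed above (Object Size Classification in Garbage Disposal Sensing System Using Monocular Depth Estimation) can be illustrated with a small sketch. This is not the authors' implementation: the function name `estimate_size`, its parameters, the centimetre unit, and all numbers are hypothetical. It also assumes a pinhole camera model and that the relative depth values are proportional to true distance, so the unknown scale cancels in the ratio.

```python
def estimate_size(obj_px, ref_px, ref_size, obj_depth=1.0, ref_depth=1.0):
    """Estimate an object's physical length from a reference of known size.

    obj_px, ref_px       -- pixel lengths of the object and the reference
    ref_size             -- known physical length of the reference (any unit)
    obj_depth, ref_depth -- relative depths (assumed proportional to distance)

    Under a pinhole model, physical length is proportional to
    pixel length times distance, so:
        obj_size = ref_size * (obj_px * obj_depth) / (ref_px * ref_depth)
    """
    if ref_px <= 0 or ref_depth <= 0:
        raise ValueError("reference pixel length and depth must be positive")
    return ref_size * (obj_px * obj_depth) / (ref_px * ref_depth)


if __name__ == "__main__":
    # Hypothetical values: a reference edge 200 cm long spans 600 px,
    # and a detected object spans 120 px.
    same_plane = estimate_size(120, 600, 200.0)
    # Same pixel lengths, but the object is 1.5x farther away than the reference.
    farther = estimate_size(120, 600, 200.0, obj_depth=1.5, ref_depth=1.0)
    print(same_plane, farther)
```

When the object and the reference lie at the same depth, the depth ratio is 1 and the estimate reduces to a plain pixel-ratio comparison.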

 

Courses Taught

  • Fundamentals of Information Technology 1

    Academic Year 2026

  • Fundamentals of Information Technology 2

    Academic Year 2026