Sithu Aung

I was an M.S. student in the Visual Intelligence Group, part of the Korea Institute of Science and Technology at the University of Science and Technology, Korea, where I worked on 3D computer vision. My M.S. advisor was Junghyun Cho.

Email  /  GitHub  /  Google Scholar

I'm interested in 3D computer vision, scene reconstruction and understanding, and synthetic data generation.

Multi-View Pedestrian Occupancy Prediction with a Novel Synthetic Dataset

Sithu Aung, Min-Cheol Sagong, Junghyun Cho
The 39th Annual AAAI Conference on Artificial Intelligence (AAAI), 2025
paper / website

A new occupancy prediction dataset for dense pedestrians in multi-view environments.

Enhancing Multi-view Pedestrian Detection Through Generalized 3D Feature Pulling

Sithu Aung, Haesol Park, Hyungjoo Jung, Junghyun Cho
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024
paper / youtube

An effective multi-view feature pulling method for detecting dense pedestrian crowds with sparsely placed CCTV cameras in a spacious scene.

Open-source Projects

I open-sourced a number of projects in my earlier days.

Easy-to-use Face Analysis Tool

code

A face analysis tool covering face and landmark detection, face recognition, facial expression recognition, and facial attribute classification.

A collection of SOTA Image Classification Models in PyTorch

code

Intended to make SOTA image classification models easy to use, integrate into downstream tasks, and fine-tune on custom datasets.

SOTA Segmentation Models in PyTorch

code

Easy-to-use, customizable SOTA semantic segmentation models in PyTorch.

Top-Down Multi-person Pose Estimation

code

Multi-person pose estimation based on the top-down method, combining a real-time YOLO detector with SOTA pose estimation models.

Object Tracking with YOLO, CLIP and DeepSORT

code

Tracking-by-detection that combines a real-time YOLO detector with the classic DeepSORT tracking pipeline and CLIP as a zero-shot feature extractor.

Design and source code from Jon Barron's website