Abstract: Given a task and a set of steps composing it, Video Step Grounding (VSG) aims to detect which steps are performed in a video. Standard approaches for this task require a labeled training set ...