HowToKB is the first large-scale knowledge base that represents how-to (task) knowledge. Each task is represented by a frame with attributes for its parent task, preceding sub-task, following sub-task, required tools or other items, and links to visual illustrations.
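To make the frame structure concrete, the following is a minimal sketch of how one task frame could be modeled. The class name, field names, and example values are illustrative assumptions, not the actual HowToKB schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TaskFrame:
    """Hypothetical frame for one task; field names are assumptions."""
    phrase: str                              # task phrase, e.g. "apply primer"
    parent_task: Optional[str] = None        # the task this one is a step of
    previous_subtask: Optional[str] = None   # step that precedes this one
    next_subtask: Optional[str] = None       # step that follows this one
    required_items: List[str] = field(default_factory=list)  # tools or other items
    image_urls: List[str] = field(default_factory=list)      # visual illustrations


# Invented example values, for illustration only.
frame = TaskFrame(
    phrase="apply primer",
    parent_task="paint a wall",
    previous_subtask="clean the wall",
    next_subtask="apply first coat of paint",
    required_items=["primer", "paint roller"],
)
```

Storing the temporal neighbors as attributes of the frame keeps each task self-describing while still allowing a sequence of sub-tasks to be reconstructed by following the previous and next links.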
Our methodology first applies Open-IE techniques to WikiHow articles to extract noisy and ambiguous candidates for tasks and sub-tasks. We then use judiciously devised clustering techniques to clean and organize these candidates and to infer attribute values. To canonicalize tasks and sub-tasks, we leverage word embeddings to distinguish different meanings of the same phrase (e.g., "use keyboard").
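The canonicalization idea can be illustrated with a small sketch that separates occurrences of the same phrase by the similarity of their context vectors. This is not the clustering method used for HowToKB: the greedy threshold clustering, the function names, the threshold value, and the hand-made three-dimensional vectors are all assumptions chosen for illustration. In practice, the vectors would be derived from word embeddings of the surrounding text.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def group_by_sense(candidates, threshold=0.8):
    """Greedily group (phrase, context_vector) pairs into senses.

    A candidate joins the most similar existing cluster with the same
    phrase if the cosine similarity to that cluster's centroid reaches
    the threshold; otherwise it starts a new cluster. Returns a list of
    clusters, each a list of candidate indices.
    """
    clusters = []
    for index, (phrase, vector) in enumerate(candidates):
        best, best_similarity = None, threshold
        for cluster in clusters:
            if cluster["phrase"] != phrase:
                continue
            similarity = cosine(vector, cluster["centroid"])
            if similarity >= best_similarity:
                best, best_similarity = cluster, similarity
        if best is None:
            clusters.append({"phrase": phrase, "members": [index],
                             "vectors": [vector], "centroid": list(vector)})
        else:
            best["members"].append(index)
            best["vectors"].append(vector)
            best["centroid"] = mean_vector(best["vectors"])
    return [cluster["members"] for cluster in clusters]


# Toy context vectors (invented): dimension 0 ~ computing, dimension 1 ~ music.
candidates = [
    ("use keyboard", [0.9, 0.1, 0.0]),  # context about typing
    ("use keyboard", [0.1, 0.9, 0.1]),  # context about playing music
    ("use keyboard", [0.8, 0.2, 0.1]),  # context about typing
    ("use keyboard", [0.0, 1.0, 0.2]),  # context about playing music
]
groups = group_by_sense(candidates)
print(groups)  # → [[0, 2], [1, 3]]
```

With these toy vectors, the four occurrences of "use keyboard" split into two groups, one per meaning, and each group could then be mapped to its own canonical task.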