LaneLM: Lane Detection as Language Modeling


Abstract

Lane detection ensures that vehicles remain within drivable areas. However, auto-annotation for lane detection often underperforms in corner cases. If the outputs of large vision-language models or human feedback could be used as prompts in these cases, annotation quality would improve substantially; yet existing lane detection models lack the capability for interaction. We present LaneLM, an interactive and promptable framework for lane detection. To our knowledge, we are the first to tackle lane detection as a visual question-answering task that progressively estimates the different lane sequences in an image through multi-turn conversation, where each estimate is prompted by the previous context and in turn conditions subsequent ones. When the model fails on a corner case, a few keypoint prompts suffice to recover the correct lanes, which speeds up manual image annotation and enables prompting in the style of language models. Notably, using only 4-point prompts, LaneLM achieves state-of-the-art performance and surpasses previous SOTA methods, especially in corner cases; in particular, it reaches an 82.71% F1-score on CULane. Despite its accuracy, LaneLM is lightweight, with very few trainable parameters.
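To illustrate the multi-turn, prompt-conditioned decoding described above, the following is a minimal sketch of the control flow only. All names and the dummy decoding logic are assumptions for illustration, not the paper's implementation: a real model would attend over image features, whereas this stub simply echoes a supplied keypoint prompt or emits a placeholder lane.

```python
# Hypothetical sketch of LaneLM-style multi-turn decoding (names and
# logic assumed, not taken from the paper): each lane is estimated in
# one conversational turn, conditioned on previously decoded lanes and
# on optional keypoint prompts supplied when a corner case fails.

from typing import Dict, List, Optional, Tuple

Point = Tuple[int, int]  # (x, y) keypoint in image coordinates


def decode_lane(context: List[List[Point]],
                prompt: Optional[List[Point]]) -> List[Point]:
    """Stand-in for one decoding turn.

    A real model would fuse image features with the running context;
    here we echo the prompt, or emit a dummy lane shifted by how many
    lanes precede it, so the control flow is runnable.
    """
    if prompt:  # human/VLM keypoints guide this turn
        return list(prompt)
    offset = 10 * len(context)  # placeholder geometry only
    return [(offset, y) for y in range(0, 30, 10)]


def detect_lanes(num_lanes: int,
                 prompts: Optional[Dict[int, List[Point]]] = None
                 ) -> List[List[Point]]:
    """Multi-turn loop: each estimate joins the context for the next."""
    prompts = prompts or {}
    lanes: List[List[Point]] = []
    for turn in range(num_lanes):
        lanes.append(decode_lane(lanes, prompts.get(turn)))
    return lanes


# Turn 1 is corrected with a 4-point prompt, mirroring the annotation
# workflow the abstract describes.
result = detect_lanes(3, prompts={1: [(5, 0), (6, 10), (7, 20), (8, 30)]})
```

The point of the sketch is the interaction pattern: unprompted turns run autoregressively over prior lane estimates, while a prompted turn injects external keypoints that then condition every later turn.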
