
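Overview

This article first introduces two techniques commonly used in object tracking, the Hungarian algorithm and the Kalman filter, and then walks through the workflow of the classic DeepSORT algorithm together with an analysis of the relevant source code.

Most mainstream object tracking algorithms today follow the tracking-by-detection strategy, that is, tracking is performed on top of object detection results. DeepSORT uses this strategy. The accompanying demo video shows DeepSORT tracking a crowd; the number at the top-left corner of each bbox is a unique ID that identifies one person.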

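This raises a question: the same person appears at different positions at different moments of the video, so how are those appearances associated with each other? The answer is the Hungarian algorithm together with the Kalman filter.

The Hungarian algorithm tells us whether a target in the current frame is the same as a target in the previous frame.
The Kalman filter predicts a target's position at the current time step from its position at the previous time step, and it can estimate that position more accurately than the sensor alone (in object tracking the sensor is the object detector, for example YOLO).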

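Hungarian Algorithm

First, what is the assignment problem? Suppose there are N people and N tasks. Any task can be given to any person, and the cost for each person to complete each task is known and generally differs. How should the tasks be assigned, one per person, so that the total cost is minimized?

For example, suppose 3 tasks are to be assigned to 3 people, and the cost matrix of each person completing each task is as follows (the cost can be money, time, and so on):

            Task 1  Task 2  Task 3
Person 1      15      40      45
Person 2      20      60      35
Person 3      20      40      25

How can we find the optimal assignment that minimizes the total cost of completing all tasks?

The Hungarian algorithm (also known as the Kuhn-Munkres, or KM, algorithm) is one method for solving the assignment problem. It is based on the following theorem:

If the same number is added to or subtracted from every entry of one row or one column of the cost matrix, an optimal assignment for the new cost matrix is still an optimal assignment for the original cost matrix.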
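
The theorem can be checked numerically on the 3x3 example. The snippet below is a minimal sketch, not part of DeepSORT: it brute-forces every possible assignment, and the helper name best_assignment and the shift values 15 and 7 are made up for illustration.

```python
from itertools import permutations

import numpy as np


def best_assignment(cost):
    """Brute-force the optimal assignment of an N x N cost matrix."""
    n = cost.shape[0]
    # Each permutation maps row i (person) to column perm[i] (task).
    return min(permutations(range(n)),
               key=lambda perm: sum(cost[i, perm[i]] for i in range(n)))


cost = np.array([[15, 40, 45],
                 [20, 60, 35],
                 [20, 40, 25]])

shifted = cost.copy()
shifted[0, :] -= 15   # subtract a constant from one row
shifted[:, 1] += 7    # add a constant to one column

original_best = best_assignment(cost)
shifted_best = best_assignment(shifted)
print(original_best, shifted_best)
```

Every assignment uses exactly one entry from each row and each column, so shifting a row or a column changes the total cost of all assignments by the same amount and leaves their ranking untouched.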

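Algorithm steps (assuming the matrix is an N x N square matrix):

1. For each row of the matrix, subtract the row's smallest element from every element of that row.
2. For each column of the matrix, subtract the column's smallest element from every element of that column.
3. Cover all zeros in the matrix with the minimum number of horizontal or vertical lines.
4. If the number of lines equals N, an optimal assignment has been found and the algorithm ends; otherwise go to step 5.
5. Find the smallest element that is not covered by any line. Subtract it from every uncovered row, add it to every covered column, and return to step 3.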
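
Steps 1 and 2 are plain row and column reductions and are easy to sketch with NumPy. This is only an illustration of those two steps on the example matrix; the line-covering steps 3 to 5 are not implemented here.

```python
import numpy as np

cost = np.array([[15, 40, 45],
                 [20, 60, 35],
                 [20, 40, 25]])

# Step 1: subtract each row's minimum from that row.
row_reduced = cost - cost.min(axis=1, keepdims=True)
# Step 2: subtract each column's minimum from that column.
reduced = row_reduced - row_reduced.min(axis=0, keepdims=True)
print(reduced)
```

After these two steps every row and every column contains at least one zero, which is what the line-covering test in step 3 operates on.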

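Continuing with the example above:

step1 The smallest elements of the three rows are 15, 20, and 20. Subtracting them gives:

     0  25  30
     0  40  15
     0  20   5

step2 The smallest elements of the three columns are 0, 20, and 5. Subtracting them gives:

     0   5  25
     0  20  10
     0   0   0

step3 Cover all zeros with the minimum number of horizontal or vertical lines: one line through the first column and one line through the third row.

step4 The number of lines is 2, which is less than 3, so go on to the next step.

step5 The smallest uncovered element is now 5. Subtracting 5 from the uncovered rows (the first and second rows) gives:

    -5   0  20
    -5  15   5
     0   0   0

Adding 5 to the covered column (the first column) gives:

     0   0  20
     0  15   5
     5   0   0

Return to step3 and cover all zeros with the minimum number of horizontal or vertical lines. This now takes three lines, for example the first row, the first column, and the third row.

step4 The number of lines is 3, so the condition is met and the algorithm ends. Assigning task 2 to person 1, task 1 to person 2, and task 3 to person 3 gives the minimum total cost in the reduced matrix (0 + 0 + 0 = 0).

The minimum total cost for the original matrix is therefore 40 + 20 + 25 = 85.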

Both the linear_assignment() function in sklearn and the linear_sum_assignment() function in scipy solve this assignment problem; they differ only in the form of their return values:

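import numpy as np
# Note: sklearn.utils.linear_assignment_ was deprecated in scikit-learn 0.21 and
# removed in 0.23, so this import needs an older scikit-learn release.
from sklearn.utils.linear_assignment_ import linear_assignment
from scipy.optimize import linear_sum_assignment

# Rows are people, columns are tasks
cost_matrix = np.array([
    [15, 40, 45],
    [20, 60, 35],
    [20, 40, 25]
])

# sklearn returns an (N, 2) array of (row, column) pairs
matches = linear_assignment(cost_matrix)
print('sklearn API result:\n', matches)
# scipy returns a tuple (row_indices, column_indices)
matches = linear_sum_assignment(cost_matrix)
print('scipy API result:\n', matches)

"""Outputs
sklearn API result:
 [[0 1]
  [1 0]
  [2 2]]
scipy API result:
 (array([0, 1, 2], dtype=int64), array([1, 0, 2], dtype=int64))
"""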

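In DeepSORT, the Hungarian algorithm is used to associate the tracks (tracking boxes) from the previous frame with the detections (detection boxes) in the current frame. The cost matrix is computed from appearance information and the Mahalanobis distance, or from IOU.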
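
To make the IOU-based cost concrete, here is a minimal sketch of such a cost matrix. It is not DeepSORT's own implementation: the function names iou and iou_cost_matrix, the (x, y, w, h) top-left box layout, and the sample boxes are all assumptions made for illustration.

```python
import numpy as np


def iou(box_a, box_b):
    """IoU of two boxes given as (x, y, w, h), with (x, y) the top-left corner."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def iou_cost_matrix(track_boxes, det_boxes):
    """cost[i, j] = 1 - IoU(track i, detection j); lower means a better match."""
    cost = np.zeros((len(track_boxes), len(det_boxes)))
    for i, t in enumerate(track_boxes):
        for j, d in enumerate(det_boxes):
            cost[i, j] = 1.0 - iou(t, d)
    return cost


tracks = [(0, 0, 10, 10), (50, 50, 10, 10)]       # hypothetical predicted track boxes
detections = [(52, 50, 10, 10), (0, 0, 10, 10)]   # hypothetical detections
cost = iou_cost_matrix(tracks, detections)
print(cost)
```

Rows index tracks and columns index detections, which is the layout the assignment solver expects.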

Source code walkthrough:

#  linear_assignment.py
def min_cost_matching(distance_metric, max_distance, tracks, detections,
                      track_indices=None, detection_indices=None):
    ...

    # Compute the cost matrix
    cost_matrix = distance_metric(tracks, detections, track_indices, detection_indices)
    cost_matrix[cost_matrix > max_distance] = max_distance + 1e-5

    # Run the Hungarian algorithm to get the matched index pairs:
    # row indices refer to tracks, column indices refer to detections
    row_indices, col_indices = linear_assignment(cost_matrix)

    matches, unmatched_tracks, unmatched_detections = [], [], []

    # Collect the unmatched detections
    for col, detection_idx in enumerate(detection_indices):
        if col not in col_indices:
            unmatched_detections.append(detection_idx)

    # Collect the unmatched tracks
    for row, track_idx in enumerate(track_indices):
        if row not in row_indices:
            unmatched_tracks.append(track_idx)

    # Iterate over the matched (track, detection) index pairs
    for row, col in zip(row_indices, col_indices):
        track_idx = track_indices[row]
        detection_idx = detection_indices[col]
        # A pair whose cost exceeds the threshold max_distance also counts as unmatched
        if cost_matrix[row, col] > max_distance:
            unmatched_tracks.append(track_idx)
            unmatched_detections.append(detection_idx)
        else:
            matches.append((track_idx, detection_idx))

    return matches, unmatched_tracks, unmatched_detections
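
The same clamp, assign, then reject pattern can be tried in isolation. The following is a self-contained sketch built on scipy's linear_sum_assignment; the function name match_with_threshold and the sample cost matrix are hypothetical, and plain row and column positions stand in for the track and detection indices.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def match_with_threshold(cost_matrix, max_distance):
    """Solve the assignment, then reject pairs whose cost exceeds max_distance."""
    cost = np.asarray(cost_matrix, dtype=float).copy()
    # Clamp large costs so that they cannot dominate the assignment.
    cost[cost > max_distance] = max_distance + 1e-5
    rows, cols = linear_sum_assignment(cost)

    matches = []
    unmatched_rows = [r for r in range(cost.shape[0]) if r not in rows]
    unmatched_cols = [c for c in range(cost.shape[1]) if c not in cols]
    for r, c in zip(rows, cols):
        if cost[r, c] > max_distance:
            unmatched_rows.append(int(r))
            unmatched_cols.append(int(c))
        else:
            matches.append((int(r), int(c)))
    return matches, sorted(unmatched_rows), sorted(unmatched_cols)


# Hypothetical 2-track x 3-detection cost matrix.
cost = [[0.1, 0.9, 0.8],
        [0.9, 0.95, 0.85]]
matches, unmatched_tracks, unmatched_detections = match_with_threshold(cost, 0.7)
print(matches, unmatched_tracks, unmatched_detections)
```

The second track is still assigned a detection by the solver, but the pair is discarded afterwards because its cost is above the threshold.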

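Kalman Filter

The Kalman filter is widely used in drones, autonomous driving, satellite navigation, and other fields. Put simply, it uses sensor measurements to update a predicted value in order to obtain a more accurate estimate.

Suppose we want to track how the position of a small cart changes. In the accompanying figure, the blue distribution is the Kalman filter's prediction, the brown distribution is the sensor measurement, and the gray distribution is the optimal estimate obtained by updating the prediction with the measurement.

In object tracking, the following two pieces of state need to be estimated for each track:

Mean: represents the target's position information. It consists of the bbox center coordinates (cx, cy), the aspect ratio r, the height h, and the velocity of each of these, written as the 8-dimensional vector x = [cx, cy, r, h, vx, vy, vr, vh]. All velocities are initialized to 0.
Covariance: represents the uncertainty of the target's position information, as an 8x8 matrix that is initialized as a diagonal matrix. The larger an entry, the greater the uncertainty; the initial values can be chosen freely.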
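
As a small illustration of this state layout, the sketch below builds the 8-dimensional mean from one bounding box and pairs it with a diagonal covariance. The helper name bbox_to_state and the covariance values are assumptions chosen only for the example.

```python
import numpy as np


def bbox_to_state(cx, cy, w, h):
    """Build the 8-dimensional mean [cx, cy, r, h, vx, vy, vr, vh] from a box."""
    r = w / h                      # aspect ratio (width / height)
    position = np.array([cx, cy, r, h], dtype=float)
    velocity = np.zeros(4)         # velocities start at 0
    return np.concatenate([position, velocity])


mean = bbox_to_state(cx=100, cy=200, w=50, h=100)
# Illustrative diagonal covariance: the values below are assumptions.
covariance = np.diag([10.0, 10.0, 1e-2, 10.0, 100.0, 100.0, 1e-5, 100.0])
print(mean)
```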

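The Kalman filter works in two stages: (1) predict the position of the track at the next time step, and (2) update the predicted position based on the detection.

The formulas used in these two stages are introduced below. (Their derivation is not covered here, since I am not clear on the underlying theory myself; the aim is only to explain what each formula does.)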

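Prediction

Predict the state of a track at time t from its state at time t-1.

In formula 1, x is the mean of the track at time t-1 and F is the state transition matrix; the formula predicts x' at time t:

    x' = F x                (1)

F is the 8x8 identity matrix with an extra dt in the entries that couple each position component to its velocity, where dt is the time difference between the current frame and the previous frame. Expanding the matrix multiplication on the right-hand side gives cx' = cx + dt*vx, cy' = cy + dt*vy, and so on, so the Kalman filter used here is a constant velocity model.

In formula 2, P is the covariance of the track at time t-1 and Q is the system (process) noise matrix, which reflects how reliable the motion model is and is usually initialized to small values; the formula predicts P' at time t:

    P' = F P F^T + Q        (2)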
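
The prediction step can be reproduced in a few lines of NumPy. This is a minimal sketch of formulas 1 and 2 under assumed values: the state x, the covariance P, and the noise Q below are hypothetical numbers, and dt is taken to be one frame.

```python
import numpy as np

ndim, dt = 4, 1.0

# State transition matrix F for x = [cx, cy, r, h, vx, vy, vr, vh]:
# identity, plus dt in the upper-right block so position += dt * velocity.
F = np.eye(2 * ndim)
for i in range(ndim):
    F[i, ndim + i] = dt

x = np.array([100.0, 200.0, 0.5, 100.0, 2.0, -1.0, 0.0, 0.5])  # hypothetical state
P = np.eye(2 * ndim)            # hypothetical covariance at time t-1
Q = 0.01 * np.eye(2 * ndim)     # hypothetical small process noise

x_pred = F @ x                  # formula 1: x' = F x
P_pred = F @ P @ F.T + Q        # formula 2: P' = F P F^T + Q
print(x_pred)
```

Positions move by their velocities while the velocities themselves stay unchanged, and the position uncertainty grows because the velocity uncertainty is folded into it.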

Source code walkthrough:

#  kalman_filter.py
def predict(self, mean, covariance):
    """Run Kalman filter prediction step.

    Parameters
    ----------
    mean: ndarray, the 8 dimensional mean vector of the object state at the previous time step.
    covariance: ndarray, the 8x8 dimensional covariance matrix of the object state at the previous time step.

    Returns
    -------
    (ndarray, ndarray), the mean vector and covariance matrix of the predicted state.
     Unobserved velocities are initialized to 0 mean.
    """
    # Standard deviations scale with the box height mean[3]
    std_pos = [
        self._std_weight_position * mean[3],
        self._std_weight_position * mean[3],
        1e-2,
        self._std_weight_position * mean[3]]
    std_vel = [
        self._std_weight_velocity * mean[3],
        self._std_weight_velocity * mean[3],
        1e-5,
        self._std_weight_velocity * mean[3]]

    motion_cov = np.diag(np.square(np.r_[std_pos, std_vel]))  # build the process noise matrix Q
    mean = np.dot(self._motion_mat, mean)  # x' = Fx
    covariance = np.linalg.multi_dot((self._motion_mat, covariance, self._motion_mat.T)) + motion_cov  # P' = FPF^T + Q

    return mean, covariance

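Update

Using the detection observed at time t, correct the state of the track associated with it to obtain a more accurate result.

    y = z - H x'            (3)
    S = H P' H^T + R        (4)
    K = P' H^T S^-1         (5)
    x = x' + K y            (6)
    P = (I - K H) P'        (7)

In formula 3, z is the mean vector of the detection, which contains no velocity terms, that is z = [cx, cy, r, h]. H is the measurement matrix, which maps the track's mean vector x' into measurement space. The formula computes the error (innovation) between the detection and the track mean.

In formula 4, R is the noise matrix of the detector. It is a 4x4 diagonal matrix whose diagonal entries are the noise of the two center coordinates, the aspect ratio, and the height. Its values can be chosen freely; the noise on the box-shape terms is usually set larger than the noise on the center point. The formula first maps the covariance matrix P' into measurement space and then adds the noise matrix R.

Formula 5 computes the Kalman gain K, which weights how much the error y should influence the corrected state.

Formulas 6 and 7 give the updated mean vector x and covariance matrix P.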
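
Written out directly with NumPy, the five update formulas look as follows. This is a sketch with hypothetical numbers for x', P', R, and z, and it uses an explicit matrix inverse for readability rather than the factorization used in the DeepSORT source.

```python
import numpy as np

# Measurement matrix H picks [cx, cy, r, h] out of the 8-dimensional state.
H = np.eye(4, 8)

x_pred = np.array([102.0, 199.0, 0.5, 100.5, 2.0, -1.0, 0.0, 0.5])  # hypothetical x'
P_pred = np.eye(8)                                                  # hypothetical P'
R = 0.5 * np.eye(4)                                                 # hypothetical detector noise
z = np.array([104.0, 198.0, 0.5, 101.0])                            # hypothetical detection

y = z - H @ x_pred                       # formula 3: innovation
S = H @ P_pred @ H.T + R                 # formula 4: projected covariance plus noise
K = P_pred @ H.T @ np.linalg.inv(S)      # formula 5: Kalman gain
x_new = x_pred + K @ y                   # formula 6: updated mean
P_new = (np.eye(8) - K @ H) @ P_pred     # formula 7: updated covariance
print(x_new[:4])
```

With these numbers the gain is 2/3 on each observed component, so the corrected position lands two thirds of the way from the prediction to the detection, and the position uncertainty shrinks from 1 to 1/3.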

Source code walkthrough:

#  kalman_filter.py (uses numpy as np and scipy.linalg)
def project(self, mean, covariance):
    """Project state distribution to measurement space.

    Parameters
    ----------
    mean: ndarray, the state's mean vector (8 dimensional array).
    covariance: ndarray, the state's covariance matrix (8x8 dimensional).

    Returns
    -------
    (ndarray, ndarray), the projected mean and covariance matrix of the given state estimate.
    """
    std = [self._std_weight_position * mean[3],
           self._std_weight_position * mean[3],
           1e-1,
           self._std_weight_position * mean[3]]

    innovation_cov = np.diag(np.square(std))  # build the measurement noise matrix R
    mean = np.dot(self._update_mat, mean)  # project the mean into measurement space: Hx'
    covariance = np.linalg.multi_dot((
        self._update_mat, covariance, self._update_mat.T))  # project the covariance into measurement space: HP'H^T
    return mean, covariance + innovation_cov  # the second value is S = HP'H^T + R


def update(self, mean, covariance, measurement):
    """Run Kalman filter correction step.

    Parameters
    ----------
    mean: ndarray, the predicted state's mean vector (8 dimensional).
    covariance: ndarray, the state's covariance matrix (8x8 dimensional).
    measurement: ndarray, the 4 dimensional measurement vector (x, y, a, h), where (x, y) is the
                 center position, a the aspect ratio, and h the height of the bounding box.

    Returns
    -------
    (ndarray, ndarray), the measurement-corrected state distribution.
    """
    # Project mean and covariance into measurement space to get Hx' and S
    projected_mean, projected_cov = self.project(mean, covariance)
    # Cholesky factorization of S (symmetric positive definite), so S never has to be inverted explicitly
    chol_factor, lower = scipy.linalg.cho_factor(projected_cov, lower=True, check_finite=False)
    # Kalman gain K (formula 5): K = P'H^T S^-1 is equivalent to S K^T = (P'H^T)^T because S is symmetric;
    # cho_solve solves that linear system for K^T and the final .T gives K
    kalman_gain = scipy.linalg.cho_solve(
            (chol_factor, lower), np.dot(covariance, self._update_mat.T).T,
            check_finite=False).T
    # y = z - Hx'
    innovation = measurement - projected_mean
    # x = x' + Ky
    new_mean = mean + np.dot(innovation, kalman_gain.T)
    # P = (I - KH)P', computed in the equivalent form P' - K S K^T
    new_covariance = covariance - np.linalg.multi_dot((kalman_gain, projected_cov, kalman_gain.T))

    return new_mean, new_covariance
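
The link between the Cholesky-based lines in update() and formula 5 can be verified numerically. The sketch below uses a randomly generated, hypothetical P' and compares the direct formula K = P' H^T S^-1 with the cho_factor / cho_solve route, and then compares the two ways of writing the covariance update.

```python
import numpy as np
import scipy.linalg

rng = np.random.default_rng(0)
H = np.eye(4, 8)
A = rng.normal(size=(8, 8))
P_pred = A @ A.T + 8 * np.eye(8)          # hypothetical symmetric positive-definite P'
S = H @ P_pred @ H.T + 0.5 * np.eye(4)    # hypothetical S = H P' H^T + R

# Direct form of formula 5: K = P' H^T S^-1.
K_direct = P_pred @ H.T @ np.linalg.inv(S)

# Cholesky route: K S = P' H^T  <=>  S K^T = (P' H^T)^T, because S is symmetric.
chol, lower = scipy.linalg.cho_factor(S, lower=True, check_finite=False)
K_chol = scipy.linalg.cho_solve((chol, lower), (P_pred @ H.T).T, check_finite=False).T

# The covariance update P' - K S K^T equals (I - K H) P'.
P_a = P_pred - K_chol @ S @ K_chol.T
P_b = (np.eye(8) - K_chol @ H) @ P_pred
print(np.allclose(K_direct, K_chol), np.allclose(P_a, P_b))
```

Solving the linear system through a Cholesky factor avoids forming S^-1 explicitly, which is generally the more numerically stable choice for a symmetric positive-definite S.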

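DeepSORT Workflow

DeepSORT processes each frame as follows:

Detector produces bboxes → generate detections → Kalman filter prediction → match the predicted tracks against the detections in the current frame with the Hungarian algorithm (cascade matching and IOU matching) → Kalman filter update

Frame 0: the detector finds 3 detections. There are no tracks yet, so these 3 detections are initialized as tracks.
Frame 1: the detector again finds 3 detections. The tracks from Frame 0 are first run through prediction to obtain new tracks, the Hungarian algorithm then matches the new tracks against the detections to produce (track, detection) pairs, and finally each track is updated with the detection it was paired with.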
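
The Frame 0 / Frame 1 behavior can be imitated with a deliberately simplified loop. This sketch is not DeepSORT: it skips the Kalman prediction and the appearance features and matches on center distance only. The names center_distance and step, the max_distance value, and the sample boxes are all hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def center_distance(a, b):
    """Euclidean distance between the centers of two (cx, cy, w, h) boxes."""
    return float(np.hypot(a[0] - b[0], a[1] - b[1]))


def step(tracks, detections, next_id, max_distance=30.0):
    """Process one frame: match existing tracks to detections, then add new tracks."""
    track_ids = list(tracks)
    matched = set()
    if track_ids and detections:
        cost = np.array([[center_distance(tracks[t], d) for d in detections]
                         for t in track_ids])
        rows, cols = linear_sum_assignment(cost)
        for r, c in zip(rows, cols):
            if cost[r, c] <= max_distance:
                tracks[track_ids[r]] = detections[c]   # "update" with the detection
                matched.add(int(c))
    for j, det in enumerate(detections):
        if j not in matched:                           # unmatched detection -> new track
            tracks[next_id] = det
            next_id += 1
    return tracks, next_id


tracks, next_id = {}, 1
frame0 = [(10, 10, 5, 10), (50, 50, 5, 10), (90, 10, 5, 10)]
frame1 = [(52, 51, 5, 10), (12, 11, 5, 10), (91, 12, 5, 10)]
tracks, next_id = step(tracks, frame0, next_id)   # Frame 0: three new tracks
tracks, next_id = step(tracks, frame1, next_id)   # Frame 1: same IDs, new positions
print(tracks)
```

Even though the detections arrive in a different order in the second frame, each ID stays attached to the same object, and no new IDs are created.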

Detection

YOLO is used as the detector to detect the bboxes in the current frame:

#  demo_yolo3_deepsort.py
def detect(self):
    while self.vdo.grab():
        ...
        bbox_xcycwh, cls_conf, cls_ids = self.yolo3(im)  # detected bboxes [cx, cy, w, h], confidences, class ids
        if bbox_xcycwh is not None:
            # Keep only the person class
            mask = cls_ids == 0
            bbox_xcycwh = bbox_xcycwh[mask]
            bbox_xcycwh[:, 3:] *= 1.2  # enlarge the box height (column 3) by a factor of 1.2
            cls_conf = cls_conf[mask]
            ...
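
The class filtering is ordinary NumPy boolean indexing. A small standalone sketch with made-up detector output (the arrays below are hypothetical, and class id 0 is assumed to mean "person"):

```python
import numpy as np

# Hypothetical detector output: boxes as [cx, cy, w, h], confidences, class ids.
bbox_xcycwh = np.array([[100.0, 200.0, 40.0, 80.0],
                        [300.0, 150.0, 60.0, 30.0],
                        [400.0, 220.0, 50.0, 100.0]])
cls_conf = np.array([0.9, 0.8, 0.6])
cls_ids = np.array([0, 2, 0])   # class 0 is assumed to be "person"

mask = cls_ids == 0
persons = bbox_xcycwh[mask]     # boolean indexing returns a copy
persons[:, 3:] *= 1.2           # column 3 onward is just the height
person_conf = cls_conf[mask]
print(persons)
```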

Generating detections

Convert the detected bboxes into detections:

#  deep_sort.py
def update(self, bbox_xywh, confidences, ori_img):
    self.height, self.width = ori_img.shape[:2]
    # Extract an appearance feature for each bbox
    features = self._get_features(bbox_xywh, ori_img)
    # [cx,cy,w,h] -> [x1,y1,w,h]
    bbox_tlwh = self._xywh_to_tlwh(bbox_xywh)
    # Drop bboxes whose confidence is below self.min_confidence and build the detections
    detections = [Detection(bbox_tlwh[i], conf, features[i]) for i,conf in enumerate(confidences) if conf > self.min_confidence]
    # NMS (self.nms_max_overlap is 1 here, so all detections are kept)
    boxes = np.array([d.tlwh for d in detections])
    scores = np.array([d.confidence for d in detections])
    indices = non_max_suppression(boxes, self.nms_max_overlap, scores)
    detections = [detections[i] for i in indices]
    ...
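
The box conversion and confidence filter are simple enough to show standalone. This sketch assumes the [cx, cy, w, h] to [x1, y1, w, h] convention named in the comment above; the function xywh_to_tlwh here is a stand-in written for illustration, and the threshold value is hypothetical.

```python
import numpy as np


def xywh_to_tlwh(bbox_xywh):
    """Convert [cx, cy, w, h] boxes to [x1, y1, w, h] (top-left corner, width, height)."""
    bbox_tlwh = np.array(bbox_xywh, dtype=float)
    bbox_tlwh[:, 0] -= bbox_tlwh[:, 2] / 2.0   # x1 = cx - w / 2
    bbox_tlwh[:, 1] -= bbox_tlwh[:, 3] / 2.0   # y1 = cy - h / 2
    return bbox_tlwh


boxes = [[100, 200, 40, 80], [300, 150, 60, 30]]
confidences = [0.9, 0.2]
min_confidence = 0.3                            # hypothetical threshold

tlwh = xywh_to_tlwh(boxes)
kept = [tlwh[i].tolist() for i, conf in enumerate(confidences) if conf > min_confidence]
print(kept)
```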

Kalman filter prediction stage

Use the Kalman filter to predict the state in the current frame of the tracks from the previous frame:

#  track.py
def predict(self, kf):
    """Propagate the state distribution to the current time step using a
       Kalman filter prediction step.
    Parameters
    ----------
    kf: The Kalman filter.
    """
    self.mean, self.covariance = kf.predict(self.mean, self.covariance)  # prediction
    self.age += 1  # total number of frames since the track first appeared
    self.time_since_update += 1  # number of frames since the track was last updated

Matching

First, the cost matrix between tracks and detections is computed from appearance information and the Mahalanobis distance. Cascade matching and IOU matching are then performed in turn, which yields all matched pairs for the current frame along with the unmatched tracks and the unmatched detections:

#  tracker.py
def _match(self, detections):
    def gated_metric(tracks, dets, track_indices, detection_indices):
        """
        Compute the cost matrix between the Kalman-predicted tracks and the detections
        of the current time step, based on appearance information and the Mahalanobis distance.
        """
        features = np.array([dets[i].feature for i in detection_indices])
        targets = np.array([tracks[i].track_id for i in track_indices])
        # Cosine-distance cost matrix between tracks and detections, based on appearance
        cost_matrix = self.metric.distance(features, targets)
        # Gate the cost matrix with the Mahalanobis distance: unsuitable entries are set to a large value
        cost_matrix = linear_assignment.gate_cost_matrix(self.kf, cost_matrix, tracks,
                                                         dets, track_indices, detection_indices)
        return cost_matrix

    # Split the tracks into confirmed and unconfirmed ones
    confirmed_tracks = [i for i, t in enumerate(self.tracks) if t.is_confirmed()]
    unconfirmed_tracks = [i for i, t in enumerate(self.tracks) if not t.is_confirmed()]

    # Cascade matching on the confirmed tracks
    matches_a, unmatched_tracks_a, unmatched_detections = \
        linear_assignment.matching_cascade(
            gated_metric, self.metric.matching_threshold, self.max_age,
            self.tracks, detections, confirmed_tracks)

    # IOU matching on the unconfirmed tracks plus the tracks left unmatched by the
    # cascade whose time_since_update is 1
    iou_track_candidates = unconfirmed_tracks + [k for k in unmatched_tracks_a if
                                                 self.tracks[k].time_since_update == 1]
    unmatched_tracks_a = [k for k in unmatched_tracks_a if
                          self.tracks[k].time_since_update != 1]
    matches_b, unmatched_tracks_b, unmatched_detections = \
        linear_assignment.min_cost_matching(
            iou_matching.iou_cost, self.max_iou_distance, self.tracks,
            detections, iou_track_candidates, unmatched_detections)

    # Merge all matched pairs and all unmatched tracks
    matches = matches_a + matches_b
    unmatched_tracks = list(set(unmatched_tracks_a + unmatched_tracks_b))

    return matches, unmatched_tracks, unmatched_detections
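
The Mahalanobis gating mentioned in gated_metric can be sketched as follows. This is an illustration under assumptions, not the gate_cost_matrix source: the function name squared_mahalanobis, the sample mean and covariance, and the choice of threshold (the 0.95 quantile of the chi-square distribution with 4 degrees of freedom, about 9.4877) are all stated here for the example.

```python
import numpy as np

CHI2_95_4DOF = 9.4877   # 0.95 quantile of the chi-square distribution with 4 degrees of freedom


def squared_mahalanobis(projected_mean, projected_cov, measurements):
    """Squared Mahalanobis distance from each measurement to the projected track state."""
    diff = np.asarray(measurements, dtype=float) - projected_mean
    solved = np.linalg.solve(projected_cov, diff.T)     # S^-1 (z - Hx')
    return np.sum(diff.T * solved, axis=0)


projected_mean = np.array([100.0, 200.0, 0.5, 100.0])       # hypothetical Hx'
projected_cov = np.diag([25.0, 25.0, 0.01, 25.0])           # hypothetical S
measurements = [[103.0, 204.0, 0.5, 100.0],                 # close to the track
                [160.0, 260.0, 0.5, 100.0]]                 # far from the track

distances = squared_mahalanobis(projected_mean, projected_cov, measurements)
gated = distances > CHI2_95_4DOF                            # True -> pair is not admissible
print(distances, gated)
```

Entries flagged as gated would then have their appearance cost overwritten with a large value, so the assignment step cannot pick a detection that is implausibly far from where the Kalman filter expects the track to be.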




# Cascade matching source code  linear_assignment.py
def matching_cascade(distance_metric, max_distance, cascade_depth, tracks, detections,
                     track_indices=None, detection_indices=None):
    ...
    unmatched_detections = detection_indices
    matches = []
    # Match the tracks of each level in turn, from the smallest level to the largest
    for level in range(cascade_depth):
        # Stop if there are no detections left
        if len(unmatched_detections) == 0:
            break
        # Indices of all tracks at the current level
        track_indices_l = [k for k in track_indices if
                           tracks[k].time_since_update == 1 + level]
        # No tracks at this level: move on to the next one
        if len(track_indices_l) == 0:
            continue

        # Hungarian matching
        matches_l, _, unmatched_detections = min_cost_matching(
            distance_metric, max_distance, tracks, detections,
            track_indices_l, unmatched_detections)
        matches += matches_l
    unmatched_tracks = list(set(track_indices) - set(k for k, _ in matches))
    return matches, unmatched_tracks, unmatched_detections
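
The level ordering used by the cascade can be seen on a toy example. The dictionary below is a hypothetical stand-in for the tracks, mapping a track index to its time_since_update value.

```python
# Hypothetical tracks: index -> frames since the last successful update.
time_since_update = {0: 1, 1: 3, 2: 1, 3: 2}
cascade_depth = 3

levels = []
for level in range(cascade_depth):
    # Tracks updated most recently (smallest time_since_update) are matched first.
    indices = [k for k, age in time_since_update.items() if age == 1 + level]
    levels.append(indices)
print(levels)
```

Tracks seen in the previous frame get the first pick of the detections, and tracks that have been missing for longer only compete for whatever detections remain.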

Kalman filter update stage

Each successfully matched track is updated with its corresponding detection, and the unmatched tracks and detections are handled:

#  tracker.py
def update(self, detections):
    """Perform measurement update and track management.
    Parameters
    ----------
    detections: List[deep_sort.detection.Detection]
                A list of detections at the current time step.
    """
    # Get the matched pairs, the unmatched tracks, and the unmatched detections
    matches, unmatched_tracks, unmatched_detections = self._match(detections)

    # Update each successfully matched track with its corresponding detection
    for track_idx, detection_idx in matches:
        self.tracks[track_idx].update(self.kf, detections[detection_idx])

    # Mark each unmatched track as missed
    for track_idx in unmatched_tracks:
        self.tracks[track_idx].mark_missed()

    # Initialize each unmatched detection as a new track
    for detection_idx in unmatched_detections:
        self._initiate_track(detections[detection_idx])

    ...


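Original title: 目標跟蹤初探（DeepSORT） (A First Look at Object Tracking: DeepSORT)

Source: WeChat official account 新機器視覺 (WeChat ID: vision263com). Please credit the source when republishing.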
