To better understand the rules, you may want to watch one of our matches first.

I'm part of the Vision Division within the RoboMaster Club of Shanghai Jiao Tong University, where I have actively contributed to several key initiatives. Across all our projects except Robot Radar, we use the NVIDIA Jetson Orin NX developer kit for high-performance computing.

The Vision Division's duty focuses on specifying the tasks robots should perform, rather than controlling them to carry those tasks out. In practice, we define custom robot commands, which are transferred via a UART communication module to a microcontroller unit (MCU). The MCU handles the deserialization of these instructions, enabling responsive, real-time control of the robot. Everything algorithmic on the MCU side falls under the remit of our club's Electronic Control Division, a team we collaborate with closely.

AutoAim Module

The AutoAim Module is one of our division's flagship contributions, and it serves as a cornerstone for the majority of the robots we design and deploy. Its relevance is most prominent during RoboMaster competitions, where robots fight by aiming shots at each other's armor plates. We improved our shooting precision by incorporating a Kalman Filter into this module: the filter forecasts likely routes of robot movement, yielding valuable estimates of where the armor plates will be. After filtering, our shooting commands are transmitted to the MCU, which aligns the turret and fires with high accuracy, giving us a significant strategic advantage over opponents.
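To illustrate the prediction step, here is a minimal constant-velocity Kalman predict sketch in C++ with Eigen. The state layout, the noise values, and the constant-velocity assumption are illustrative choices, not our exact implementation:

```cpp
#include <Eigen/Dense>

// Illustrative only: a constant-velocity predictor over state [x, y, vx, vy].
struct ConstVelKalman {
    Eigen::Vector4d x = Eigen::Vector4d::Zero();              // state estimate
    Eigen::Matrix4d P = Eigen::Matrix4d::Identity();          // estimate covariance
    Eigen::Matrix4d Q = Eigen::Matrix4d::Identity() * 1e-3;   // process noise (made up)

    // Forecast the target's state dt seconds ahead.
    void predict(double dt) {
        Eigen::Matrix4d F = Eigen::Matrix4d::Identity();
        F(0, 2) = dt;  // x += vx * dt
        F(1, 3) = dt;  // y += vy * dt
        x = F * x;
        P = F * P * F.transpose() + Q;
    }
};
```

An update step fusing each new armor-plate detection would complete the filter; the shooting command is then computed from the predicted state rather than from the latest raw detection.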

Notably, all our robotic models except the Radar come equipped with this integral AutoAim Module.

Robot Sentry

The Robot Sentry project involves designing an autonomous robotic system. The diagram below illustrates the overall structure.

(Figure: Sentry architecture)

Beginning from the bottom, we have the Hardware Layer, consisting of vision-related sensors: a Livox Mid-360 lidar and a Hikrobot MV-CS016-10UC industrial camera fitted with a 6 mm lens. Note that only vision-related hardware is covered in the diagram; the IMU, for example, is omitted because it is not vision-related and is handled by the MCU.

Above the hardware layer is the Localization Layer, which houses the LiDAR-Inertial Odometry (LIO). It uses the Iterative Closest Point (ICP) algorithm to estimate, accurately and in real time, the pose alignment between live lidar points and a pre-generated point cloud, starting from a rough initial estimate derived from IMU data. For pose estimation we use the FAST-LIO variant, which filters and smooths these computations. Concretely, we generated a .pcd lidar point-cloud map with FAST-LIO mapping, then down-sampled the cloud to further reduce the time complexity of point-cloud matching, and finally used the re-localization feature to pinpoint the robot.
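To illustrate the down-sampling step, here is a minimal sketch using PCL's VoxelGrid filter; the file names and the 0.5 m leaf size are placeholders rather than our actual values:

```cpp
#include <pcl/filters/voxel_grid.h>
#include <pcl/io/pcd_io.h>
#include <pcl/point_types.h>

int main() {
    pcl::PointCloud<pcl::PointXYZ>::Ptr map(new pcl::PointCloud<pcl::PointXYZ>);
    pcl::io::loadPCDFile("map.pcd", *map);  // hypothetical file name

    pcl::VoxelGrid<pcl::PointXYZ> filter;
    filter.setInputCloud(map);
    filter.setLeafSize(0.5f, 0.5f, 0.5f);   // one point kept per 0.5 m voxel

    pcl::PointCloud<pcl::PointXYZ> downsampled;
    filter.filter(downsampled);
    pcl::io::savePCDFileBinary("map_down.pcd", downsampled);
    return 0;
}
```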

Next comes the Navigation Layer, built upon the ROS move_base package. Its costmap is essentially a 2D .pgm map (the static layer) overlaid with further layers such as the obstacle and inflation layers. The global planner abstracts this costmap into a graph and guides route progression with the A* search algorithm. In parallel, unplanned events such as suddenly appearing pedestrians are handled by a local planner, which requires a separate obstacle layer for imminent obstacles not present in the static map layer. Note that naively treating anything with depth as an obstacle can misclassify slopes as obstacles; that is why we run a ground segmentation algorithm to segment out all the planes, including slopes.
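To make the global-planning step concrete, below is a toy A* search over a small grid costmap. The grid size, costs, and wall layout are invented for illustration; move_base's global planner is considerably more elaborate:

```cpp
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <queue>
#include <vector>

// Toy A* on a 4-connected grid costmap: 0 = free, 100 = lethal obstacle.
struct Node { int idx; double f; };
struct Cmp { bool operator()(const Node& a, const Node& b) const { return a.f > b.f; } };

int main() {
    const int W = 8, H = 8;
    std::vector<int> cost(W * H, 0);
    for (int y = 1; y < 7; ++y) cost[y * W + 4] = 100;  // a wall with gaps at top and bottom

    auto h = [&](int i) {  // Manhattan heuristic to the goal corner
        return std::abs(i % W - (W - 1)) + std::abs(i / W - (H - 1));
    };
    int start = 0, goal = W * H - 1;
    std::vector<double> g(W * H, 1e18);
    std::vector<int> parent(W * H, -1);
    std::priority_queue<Node, std::vector<Node>, Cmp> open;
    g[start] = 0;
    open.push({start, (double)h(start)});
    while (!open.empty()) {
        Node n = open.top(); open.pop();
        if (n.idx == goal) break;
        int x = n.idx % W, y = n.idx / W;
        const int dx[] = {1, -1, 0, 0}, dy[] = {0, 0, 1, -1};
        for (int k = 0; k < 4; ++k) {
            int nx = x + dx[k], ny = y + dy[k];
            if (nx < 0 || nx >= W || ny < 0 || ny >= H) continue;
            int ni = ny * W + nx;
            if (cost[ni] >= 100) continue;  // skip lethal cells
            double ng = g[n.idx] + 1;
            if (ng < g[ni]) {               // found a cheaper way to reach ni
                g[ni] = ng;
                parent[ni] = n.idx;
                open.push({ni, ng + h(ni)});
            }
        }
    }
    for (int i = goal; i != -1; i = parent[i])  // print the path, goal -> start
        std::printf("(%d,%d) ", i % W, i / W);
    return 0;
}
```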

Atop these systems is the Decision Layer, where a multi-state behavior tree steers navigation based on inputs fetched from a shared information space called the 'BlackBoard'. This serves as the data store where information passed in via UART communication resides. Our behavior tree employs components including Sequence Nodes, Loop Nodes, Action Nodes, and Task Nodes, a novel component we created to run silently in the background. The usage varies per node: Sequence Nodes prioritize different tasks and halt if a task fails, whereas Loop Nodes facilitate patrolling across multiple positions.
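The sketch below shows the halt-on-failure semantics of a Sequence node. It is a toy reconstruction, not our tree's actual API; in the real tree, nodes read their inputs from the BlackBoard:

```cpp
#include <functional>
#include <iostream>
#include <vector>

enum class Status { Success, Failure };
using Tick = std::function<Status()>;

// Sequence node: tick children in order, halting at the first failure.
Tick Sequence(std::vector<Tick> children) {
    return [children]() {
        for (const auto& child : children)
            if (child() == Status::Failure) return Status::Failure;
        return Status::Success;
    };
}

int main() {
    auto say = [](const char* msg, Status s) {
        return [msg, s]() { std::cout << msg << "\n"; return s; };
    };
    // The third child never runs because the second one fails.
    Tick patrol = Sequence({ say("check ammo", Status::Success),
                             say("navigate to waypoint", Status::Failure),
                             say("engage target", Status::Success) });
    patrol();
    return 0;
}
```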

As a whole, Robot Sentry's expansive scope contributed significantly to my learning, particularly when it came to iteratively debugging a large-scale system.

Robot Aerial

In the Aerial project, our team established a fundamental setup for indoor localization using Visual-Inertial Odometry (VIO) with PX4. Following the official guides, we employed the ROS package MAVROS as the key conduit, bridging ROS1 topics to MAVLink and enabling effective communication with PX4. For VIO functionality, we used an Intel RealSense T265.
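As a sketch of what that bridging looks like on the ROS1 side, the node below republishes a VIO pose to MAVROS so PX4 can fuse it. The topic name follows the mavros vision_pose plugin convention; verify it against your mavros version, and in practice the pose would be filled in from the T265:

```cpp
#include <ros/ros.h>
#include <geometry_msgs/PoseStamped.h>

int main(int argc, char** argv) {
    ros::init(argc, argv, "vio_bridge");
    ros::NodeHandle nh;
    ros::Publisher pub =
        nh.advertise<geometry_msgs::PoseStamped>("/mavros/vision_pose/pose", 10);

    ros::Rate rate(30);  // PX4 expects a steady stream of vision poses
    while (ros::ok()) {
        geometry_msgs::PoseStamped msg;  // left at defaults here; filled from VIO in practice
        msg.header.stamp = ros::Time::now();
        msg.header.frame_id = "map";
        pub.publish(msg);
        rate.sleep();
    }
    return 0;
}
```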

Through this scheme of operations, we successfully accomplished automatic self-positioning capabilities for our aerial robot model. This marked a significant step in enhancing drone autonomy within enclosed environments.

Robot Radar

As part of this project, we developed a comprehensive pipeline to pinpoint the precise locations of visible robots in the world frame, an accomplishment realized through lidar-camera fusion. We studied and tested several procedures for calibrating the camera frame and the lidar frame with respect to the global coordinate system. Using classical tools, Perspective-n-Point (PnP) for the camera and Iterative Closest Point (ICP) for the lidar, we obtained highly accurate results.
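A minimal sketch of the PnP step with OpenCV follows; the landmark coordinates, pixel observations, and intrinsics here are placeholders, not our calibration data:

```cpp
#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <vector>

int main() {
    // 3D landmarks in the world frame and their observed pixel positions.
    std::vector<cv::Point3f> world  = { {0, 0, 0}, {1, 0, 0}, {1, 1, 0}, {0, 1, 0} };
    std::vector<cv::Point2f> pixels = { {320, 240}, {420, 240}, {420, 340}, {320, 340} };

    cv::Mat K = (cv::Mat_<double>(3, 3) << 800, 0, 320, 0, 800, 240, 0, 0, 1);
    cv::Mat dist = cv::Mat::zeros(4, 1, CV_64F);  // assume an undistorted image

    cv::Mat rvec, tvec;
    cv::solvePnP(world, pixels, K, dist, rvec, tvec);

    cv::Mat R;
    cv::Rodrigues(rvec, R);  // world -> camera rotation
    return 0;
}
```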

The detection and tracking of all visible robots was a crucial aspect of the project. We employed the M-detector (moving-object detection from lidar point clouds) together with YOLOv9 and ByteTrack. Once robots were accurately detected and tracked, their respective positions could be determined through frame transformations that leveraged the meticulous calibration described above.
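The final frame transformation is then a single rigid-body map. In this sketch, R_wc and t_wc stand for camera-to-world extrinsics (e.g. obtained by inverting the PnP result); the names are mine, for illustration:

```cpp
#include <Eigen/Dense>

// Map a robot position measured in the camera frame into the world frame.
// R_wc, t_wc: camera-to-world rotation and translation from calibration.
Eigen::Vector3d cameraToWorld(const Eigen::Vector3d& p_cam,
                              const Eigen::Matrix3d& R_wc,
                              const Eigen::Vector3d& t_wc) {
    return R_wc * p_cam + t_wc;
}
```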

However, my responsibilities as Project Leader extended beyond the technical work alone. I dedicated considerable time to managing the team efficiently: keeping communication channels between members smooth, setting realistic deadlines, anticipating potential roadblocks, and preparing contingency plans ahead of time. This focus on effective project management ensured milestones were completed on schedule and helped prevent last-minute hitches and stressful crunches.

It was an enriching experience to lead and collaborate with proficient minds collectively pushing the boundaries of the knowledge and equipment we have. Assembling these lessons from the ground up honed my leadership skills, broadening my understanding of teamwork dynamics while instilling humility through hands-on problem-solving.

  • bitXor

XOR can be viewed as carry-free addition, but that fact doesn't seem to help with this problem.

So we fall back on the definition of XOR: bit by bit, identical bits yield 0 and differing bits yield 1.

We can write the operation above as (x & ~y) | (~x & y); what remains is to decompose the OR into AND and NOT.

Consider the logical identity \[ A \vee B = \neg \neg (A \vee B) = \neg (\neg A \wedge \neg B) \]

Thus, with only the allowed operators we obtain x ^ y = ~(~(x & ~y) & ~(~x & y)).

  • isTmax

First observe that the maximum value, 0x7fffffff, has the property ~x = x + 1.

Unfortunately this is not a sufficient condition, but luckily there is one and only one counterexample, 0xffffffff, for which both sides equal \(0\), so a special-case check suffices.

The remaining problems are already done; detailed write-ups are still to come, so here is the code for now.

```c
int bitXor(int x, int y) {
    /* x ^ y = (x & ~y) | (~x & y); De Morgan turns the OR into AND/NOT */
    return ~((~(x & ~y)) & (~(~x & y)));
}

int tmin(void) {
    /* only the sign bit set */
    return (1 << 31);
}

int isTmax(int x) {
    /* ~x == x + 1 holds for Tmax; the !!(x + 1) factor rules out x == -1 */
    return !!(x + 1) & !(~x ^ (x + 1));
}

int allOddBits(int x) {
    /* AND the four bytes together, then compare the odd bits against 0xAA */
    return !(((x & 0xAA) & ((x >> 8) & 0xAA) & ((x >> 16) & 0xAA) & ((x >> 24) & 0xAA)) ^ 0xAA);
}

int negate(int x) {
    /* two's complement: -x == ~x + 1 */
    return ~x + 1;
}

int isAsciiDigit(int x) {
    /* high bits clear, upper nibble 0x3, and low nibble at most 9
       (i.e. (x + 6) >> 4 must still be 0x3) */
    return !(x >> 8) & !((x >> 4) ^ 0x03) & !(((x + 0x06) >> 4) ^ 0x03);
}

int conditional(int x, int y, int z) {
    int mask = !x;              /* 1 if x is zero, else 0 */
    int sum = z + ~y + 1;       /* z - y */
    mask = mask | (mask << 1);  /* smear bit 0 across the word: 0 or -1 */
    mask = mask | (mask << 2);
    mask = mask | (mask << 4);
    mask = mask | (mask << 8);
    mask = mask | (mask << 16);
    return (mask & sum) + y;    /* x ? y : (z - y) + y */
}

int isLessOrEqual(int x, int y) {
    /* true if x < 0 <= y; false if y < 0 <= x; otherwise test x - y - 1 < 0
       (same signs, so the subtraction cannot overflow) */
    int sgn_x = x >> 31 & 1, sgn_y = y >> 31 & 1;
    return (sgn_x & !sgn_y) | !((sgn_y & !sgn_x) | !((x + (~(y + 1) + 1)) >> 31 & 1));
}

int logicalNeg(int x) {
    /* OR-fold every bit of x down into bit 0 */
    int x1 = (x >> 16) | x;
    int x2 = (x1 >> 8) | x1;
    int x3 = (x2 >> 4) | x2;
    int x4 = (x3 >> 2) | x3;
    int x5 = (x4 >> 1) | x4;
    return (~x5 & 1);           /* 1 iff no bit of x was set */
}

int howManyBits(int x) {
    int ans = 0;
    /* x ^ (x >> 1) keeps only the bits that still carry information,
       handling positive and negative inputs uniformly */
    x = (x >> 1) ^ x;
    /* binary search for the highest remaining set bit */
    ans = ans + ((!!(x >> 15)) << 4);
    ans = ans + ((!!(x >> (ans + 7))) << 3);
    ans = ans + ((!!(x >> (ans + 3))) << 2);
    ans = ans + ((!!(x >> (ans + 1))) << 1);
    ans = ans + (!!(x >> ans));
    return ans + 1;             /* +1 for the sign bit */
}
```

Camera Selection

How to choose a camera's resolution

Sensor format (target size): the physical size of the CMOS sensor, the camera's photosensitive element.

Pixel size: the area of the CMOS that one image pixel corresponds to.

Resolution: measured in pixels, i.e., how many pixels make up the image. Consider the spec sheets below.

(Figures: spec sheets of two candidate cameras)

For the second sheet, the arithmetic checks out: \[ 9344\ \mathrm{pixel} \times 3.2 \ \mathrm{\mu m/pixel} = 29.9 \ \mathrm{mm}\\ 7000\ \mathrm{pixel} \times 3.2 \ \mathrm{\mu m/pixel} = 22.4 \ \mathrm{mm}\\ \] But what about the first one? What does a sensor format of \(2/3\) inch mean? This is an imperial convention denoting the diagonal length of the sensor; for example, a \(1\)-inch format refers to a CMOS with a \(16\,\mathrm{mm}\) diagonal.


When shopping for a camera, filtering in the following order quickly narrows the candidates down to single digits:

  1. Color or monochrome
  2. Maximum frame rate
  3. Sensor format (i.e., the target size mentioned above)
  4. Interface protocol

For a given sensor format, more pixels generally means higher image quality; but this is not absolute: pushed toward the extreme, noise appears and image quality actually degrades.

Building on that, for the same pixel count, a larger sensor format gives better image quality: each active pixel generates an electromagnetic field, and with a larger format its interference with neighboring pixels is weaker, producing fewer defective readings.

2024.1.26 update: a few new cameras arrived today. Our requirement is to clearly see vehicle armor plates at long range, but when purchasing we had chased sensor resolution alone and ignored the FOV of the matching lens. Testing showed that although the new camera's resolution is higher, its lens FOV is also wider; in the end a distant armor plate covers even fewer pixels than on the older, lower-resolution camera.

How to deal with flicker?

First, distinguish frame rate from exposure time (i.e., shutter speed).

Frame rate is how many photos are captured per second; exposure time is, within one photo's capture period, how long the shutter stays open versus closed.

In China, mains AC runs at 50 Hz, so ordinary lights flicker at 100 Hz, with a brightness curve of \(y=|\sin x|\).

This gives us the following rules, which effectively avoid flicker in the footage:

  • The flicker frequency is an integer multiple of the frame rate: each photo's capture period then contains a whole number of complete flicker cycles, so no matter how the exposure time is set, the integrated brightness is always the same.

    For example, for a light flickering at 100 Hz, the frame rate should be 25 Hz, 50 Hz, or 100 Hz.

  • The shutter speed is an integer multiple of the flicker period: the exposure window then always contains a whole number of complete flicker cycles, so the integrated brightness is again always the same.

    For example, for a light flickering at 100 Hz, the shutter speed should be 1/100, 1/50, or 1/25 s.
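Both rules rest on the same observation: the integral of the brightness curve over any window spanning a whole number of flicker periods does not depend on where the window starts. With \(T\) the flicker period and \(t_0\) an arbitrary start time, \[ \int_{t_0}^{t_0+kT} \left|\sin(\omega t)\right| \mathrm{d}t = k\int_{0}^{T} \left|\sin(\omega t)\right| \mathrm{d}t, \qquad T = \frac{\pi}{\omega} \] For the second rule the exposure window itself spans \(kT\); for the first, consecutive frames start exactly \(kT\) apart, so every frame sees the flicker at the same phase and integrates the same brightness.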

Consider the following code:

```cpp
#include <iostream>
#include <memory>
using namespace std;

shared_ptr<int> p;

void setp(const int& b) {
    p = make_shared<int>(b);  // allocates a new int copy-initialized from b
}

int main() {
    int c = 10;
    setp(c);
    cout << *p << endl;
    c = 20;  // modifies c only, not the int that p owns
    cout << *p << endl;
    return 0;
}
```

Can you say what it outputs?

```cpp
#include <iostream>
#include <memory>
using namespace std;

shared_ptr<int> p;

void setp(const shared_ptr<int>& b) {
    p = b;  // p now shares ownership of the same int as b
}

int main() {
    int c = 10;
    shared_ptr<int> c_ptr = make_shared<int>(c);
    setp(c_ptr);
    cout << *p << endl;
    *c_ptr = 20;  // writes through to the shared object
    cout << *p << endl;
    return 0;
}
```

Answer: the first snippet prints

```
10
10
```

while the second prints

```
10
20
```

Why the difference? Because make_shared<T>(x) constructs a new object with \(x\) as its initial value; it does not create a pointer bound to \(x\) itself.


Preface:

Recently, while working on camera-lidar (MID70) joint calibration, I adopted an automatic calibration tool from HKU's MaRS Lab, and to squeeze out better performance I started studying its parameters.

I noticed a group of parameters named after RANSAC, e.g. ransac.min_dis.

The RANSAC algorithm, short for Random Sample Consensus, is a randomized iterative algorithm for picking out the outliers in a set of sample points.

RANSAC's randomness makes it a non-deterministic algorithm. Its advantage over least-squares fitting is that RANSAC accounts for the distinct influence of inliers and outliers on the model. Least squares folds every data point into the model, so in the example below it gets dragged off by the noise, whereas RANSAC can separate the inliers from the outliers and therefore handles this situation successfully.

(Figure: the limitation of least squares)

Algorithm Flow

The RANSAC procedure is easy to follow:

  1. Select a subset of all the data points, which we hypothesize lies within the inlier set

  2. Fit a model to the selected points (the model can be chosen to suit your needs, e.g. linear regression)

  3. Using a threshold, find all points whose error against the fitted model is within range

  4. If the number of points found in the previous step exceeds a threshold, compute the fitness and update the best solution so far

  5. Return to step 1 for the next iteration

In lidar-camera registration, this algorithm's role is to segment out the plane contained in a voxel. Why does this work?

Simply widen the point set to three dimensions and swap the model from 2D linear regression to 3D linear regression. A minimal sketch of the loop follows.
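The sketch fits a 2D line for brevity (the plane case is the 3D analogue); it is illustrative only, not the calibration tool's implementation, though the iteration count and dis_threshold parameters mirror the ones tuned below:

```cpp
#include <cmath>
#include <cstdlib>
#include <utility>
#include <vector>

struct Pt { double x, y; };

// RANSAC line fitting: returns (a, b) for the model y = a*x + b.
std::pair<double, double> ransacLine(const std::vector<Pt>& pts,
                                     int iter_num, double dis_threshold) {
    int bestInliers = 0;
    std::pair<double, double> best{0, 0};
    for (int it = 0; it < iter_num; ++it) {
        // 1. randomly sample a minimal set (two points define a line)
        const Pt& p = pts[std::rand() % pts.size()];
        const Pt& q = pts[std::rand() % pts.size()];
        if (std::fabs(q.x - p.x) < 1e-9) continue;  // degenerate / vertical sample
        // 2. fit the model to the sample
        double a = (q.y - p.y) / (q.x - p.x), b = p.y - a * p.x;
        // 3. count points within dis_threshold of the model
        int inliers = 0;
        for (const Pt& r : pts)
            if (std::fabs(a * r.x + b - r.y) / std::sqrt(a * a + 1) < dis_threshold)
                ++inliers;
        // 4. keep the best-scoring model so far
        if (inliers > bestInliers) { bestInliers = inliers; best = {a, b}; }
    }
    return best;
}
```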

Worst-Case Analysis

In fact, we can show that the more iterations the model runs, the more likely step 1 is to eventually draw a sample lying entirely within the inliers, and hence to produce a good model estimate. Proof:

Let \(\omega = \frac{\text{inliers num}}{\text{total sample points num}}\), and let \(p\) be the probability that, over \(k\) iterations each sampling \(n\) points, at least one iteration draws all of its points from the inliers.

\[ \begin{aligned} 1-p &= (1-\omega^n)^k \\ k &= \frac{\log (1-p)}{\log (1-\omega^n)} \end{aligned} \]

Note, however, that this calculation assumes the draws are independent (i.e., sampling with replacement).

Parameter Tuning

From the algorithm flow, there are two key parameters:

  • Ransac.dis_threshold can be understood intuitively from the figure below,

    where \(d\) denotes the error; it affects step 3 of the algorithm flow. Tuning this parameter depends on the variance of the inliers.

    (Figure: dis_threshold illustration)
  • Ransac.iter_num should be chosen based on the analysis in the previous subsection;

    note that the smaller \(\omega\) is, the more iterations are needed for the result to converge.

Sieve of Eratosthenes code:

```cpp
const int N = 10000000;  // upper bound; defined here so the snippet compiles

bool vis[N + 1];  // N + 1 entries so that index N is valid
void Eratosthenes() {
    for (int i = 2; i <= N; i++) {
        if (vis[i]) continue;  // composite: skip to the next number
        for (int j = i; j <= N / i; j++) vis[i * j] = true;  // mark i*i .. N
    }
}
```

Convention: for ease of exposition, every \(p\) appearing below denotes a prime.

Our goal is to evaluate \[ \sum_{p\le \sqrt N} \left(\frac Np -p\right) \] since for each prime \(p \le \sqrt N\), the inner loop runs roughly \(\frac Np - p\) times. The sum splits into two parts, \(\sum\limits_{p\le \sqrt N} \dfrac{N}{p}\) and \(\sum\limits_{p\le \sqrt N}p\).

The derivation below relies on the prime number theorem: let \(\pi (x)\) denote the number of primes not exceeding \(x\); then \(\pi(x) \sim \dfrac{x}{\ln x}\).

Hence the probability that \(x\) is prime is \(\pi(x) - \pi(x-1) \approx \dfrac 1{\ln x}\) (strictly speaking an expectation, but since only the single number \(x\) is involved, the expectation equals the probability). Multiplying by this probability, we can convert sums over primes \(p\) into sums over consecutive integers \(x\)!

Computing the First Part

\[ \sum_{p\le \sqrt N} \frac Np =\sum_{x=2}^{\sqrt N} \frac{N}{x\ln x}=N\sum_{x=2}^{\sqrt{N}} \frac{1}{x\ln x} \]

Now we need this sum over \(x\); the standard trick is to approximate it by an integral: \[ \sum_{x=2}^{\sqrt N} \frac1{x\ln x} \approx \int_{2}^ \sqrt N \frac{1}{x\ln x} \mathrm{d}x \] This is a classic substitution integral. Let \(u = \ln x\); differentiating gives \(\mathrm{d} u = \dfrac 1x \mathrm{d} x\), which is exactly a factor in the integrand, so substitute directly: \[ \int \frac 1{x\ln x} \mathrm{d} x=\int \frac 1u \mathrm{d}u= \ln u +C=\ln\ln x+C \] Plugging this definite integral back into the earlier expression, the first part evaluates to approximately \(O(N\ln\ln \sqrt N) =O(N\ln\ln N)\).

Computing the Second Part

After some experimentation, I settled on a double-counting argument.

The first count is the same as in the first part: \[ \sum_{p\le \sqrt N} p = \sum_{x=2}^{\sqrt{N}} \frac x {\ln x} \] The second count splits each \(p\) into \(\sum_x [1\le x\le p]\) and sums the contributions (in the spirit of Abel summation): \[ \begin{aligned} \sum_{p\le \sqrt N}p &= \sum_{x=0}^{\sqrt N}\left(\pi(\sqrt N) - \pi(x)\right)= (\sqrt N + 1)\times \pi (\sqrt N) - \sum_{x=2}^{\sqrt{N}}\pi(x) \\&= (\sqrt N + 1)\times \frac {2\sqrt N}{\ln N} -\sum_{x=2}^{\sqrt N} \frac x{\ln x} \end{aligned} \] Setting the two counts equal and solving gives the second part's approximation, \(O(\dfrac N {\ln N})\): \[ \sum_{x = 2}^{\sqrt N} \frac x {\ln x} \approx \frac{N}{\ln N} \]

Combining the Two Parts

So the total complexity of the sieve of Eratosthenes is the first part minus the second: \[ O(N\ln\ln N-\frac N{\ln N}) = O(N\ln\ln N) \] QED.

To do a good job, one must first sharpen one's tools.

Symptom

Hovering over opencv2/opencv.hpp produces the following error: xmmintrin.h In included file: definition of builtin function '_mm_getcsr'

The problem is mostly harmless, since clangd's completions for cv still work, but every file that uses OpenCV gets marked red, and for an obsessive like me that is very hard to accept.

Investigation

Searching online at first, the common explanation was that clang cannot digest gcc's headers, and the usual fix was to tweak the clangd configuration, using the Suppress feature under Diagnostics to silence the builtin_definition error. For my situation that did not work, so the matter was shelved for a while.

Then one day I noticed that when clangd cannot read compile_commands.json, then, per my Fallback Flags configuration,

```
-I/opt/ros/noetic/include
-I/usr/include/eigen3
-I/usr/include/pcl-1.10
-I/usr/include/opencv4
```
the red underline on the OpenCV headers disappears. Quite puzzled, I opened VS Code's Output panel to see how clangd's compile command in this situation differed from my compile_commands.json.

It turns out that by default clangd uses /usr/bin/clang as the compiler, even though I never installed clang and there is nothing at that path, while the compiler taken from compile_commands.json is /usr/bin/c++.

Looking closer, I found that with this phantom /usr/bin/clang compiler, xmmintrin.h no longer resolves to gcc's include directory but to /usr/lib/llvm-12/lib/clang/12.0.0/include/xmmintrin.h, i.e., clangd's own bundled headers. One glance shows that clang's and gcc's versions of this file have different implementations, which pretty much explains everything.

Solution

Simply set --query-driver=/usr/bin/clang in clangd.arguments.

In addition, so that navigation keeps working after jumping into an OpenCV header, we need to add -I/usr/include/opencv4 to the CompileFlags section of $HOME/.config/clangd/config.yaml.
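For reference, a minimal config along those lines (the key names follow clangd's YAML config schema; double-check them against your clangd version):

```yaml
CompileFlags:
  Add: [-I/usr/include/opencv4]
```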

Reflection

Many problems can in fact be solved by thinking them through on your own; when the internet offers next to no solutions, it is worth trying by yourself.

How exactly does clangd compile a file, and why does even my phantom compiler keep it working?
